UNIVERSITY OF PENNSYLVANIA






THE COMPETENCY OF FIFTY COLLEGE 

STUDENTS 

(A DIAGNOSTIC STUDY) 



BY 
KARL GREENWOOD MILLER



A THESIS 

PRESENTED TO THE FACULTY OF THE GRADUATE SCHOOL IN PARTIAL

FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE 

OF DOCTOR OF PHILOSOPHY IN PSYCHOLOGY 



PHILADELPHIA 
1922 



THE COMPETENCY OF FIFTY COLLEGE STUDENTS. 

(A Diagnostic Study.) 

NOTE 

This Thesis will be found reprinted as No. VIII of 

EXPERIMENTAL STUDIES IN PSYCHOLOGY AND PEDAGOGY

Introduction.

No task more worthy of attention confronts the psychologist 
today than the scientific study of the college student by means of 
mental tests. 

Psychological tests were first employed in the examination and 
segregation of the mentally feeble. A large number of clinics connected
with modern school systems, hospitals, or juvenile courts
have found these tests of service in detecting mental subnormality. 
It has only been in the last decade, however, that the possibilities 
of the psychological examination of "normal" individuals have been
recognized, and rapid advances are now being made in this field. 
The success with which mental tests were used in the classification 
and stratification of the great mass of men who formed our National 
Army probably did more to bring about a general acceptance of the 
method and principles involved than would have resulted from 
many years of experimentation in peace times. Today, psychological 
tests are used not only in the field of education but also form an in- 
tegral part of the selective and administrative machinery of many 
large industrial organizations. The present vogue of the mental 
test carries with it one real danger in that the uninitiated are likely 
to demand more of the psychologist than he can give. 

Without doubt it is now possible to say, as a result of a psycho- 
logical examination, that one individual possesses too little mentality 
to admit of his being a self-supporting member of society, that 
another can be trained to perform a simple task satisfactorily, that 
a third has ability which will enable him to fill a place in the great 
middle class, while still another has intellectual endowments which 
should lead him into the fields of higher education and professional 
activity. These broad classifications can be made through the 
employment of many and various tests which have been carefully 
devised and scientifically standardized. With the concept of differing 
levels of general intelligence fairly well developed the psychologist 
now faces the task of classifying individuals. When the attempt is
made not only to ascertain the general performance level but also to
determine for what occupation the specific abilities of the individual
best fit him, the difficulty of the problem is tremendously increased.
Shall the man of small competency be a ditch-digger or a stevedore?
Is the citizen of mediocre ability best qualified to follow the vocation
of motorman, mechanic or clerk? Should the college student be
guided into industry, law or teaching? 

These questions imply that the psychologist must also function
as a vocational adviser, and while this obligation may not at present 
be generally accepted, the implication is nevertheless warranted. 
Mental tests, if they are to be of value to society, must lead to 
prognoses as well as to diagnoses and must at least offer to the indi- 
vidual tested some information which may be useful in the attain- 
ment of greater personal and social efficiency. In much the same 
manner as the employment manager of today places the applicant 
in some particular position in his organization, so the psychologist
of the future may find it possible to direct each member of society 
to the one vocation which will best utilize his peculiar qualifications. 

It is hardly necessary to point out that the problem of differen- 
tiation becomes increasingly complex as the higher levels of intellec- 
tual organization are approached. The idiot may be consigned to 
custodial care with but small probability of error. The stevedore, 
the scavenger, and the ditch-digger gravitate to their respective 
occupations without perceptible friction. The "common people"
present a more difficult problem in view of their higher level of 
performance and greater complexity of response, but even here note- 
worthy advances have been made in recent years through the intro- 
duction of vocational guidance and the application of psychological 
principles to industrial management. Although investigation of 
this character has hardly passed beyond the experimental stage, a
beginning has nevertheless been made, and remarkable developments 
during the next decade may be confidently anticipated. 

The task of differentiating the particular abilities required of 
the successful plumber, mechanic, clerk, motorman, and telephone
operator, to mention only a few of the almost countless range of
occupations, is doubtless a difficult one, but it hardly approaches the
complexity of the problem presented in the guidance of individuals of 
greater intelligence and higher intellectual organization to the one 
vocation for which each is best fitted. While interest, personality,
and various external circumstances can not be disregarded as impor- 
tant factors in the selection of the life work, the concern of the 
psychologist lies primarily in the determination of the specific 
abilities requisite to each type of professional activity, and in the 
scientific evaluation of the particular abilities possessed by each 
individual. It is with the latter phase of the problem that this inves- 
tigation will deal, the interest being centered on the college student, 
who, despite his many shortcomings, must be regarded as representative
of the highest intellectual type of young manhood in the country.



Historical. 

The attempt to appraise the undergraduate by means of mental 
tests must not be considered a new departure in the field of psy- 
chology. The credit for the first scientific study of the American 
college student goes to J. McKean Cattell. Stimulated by his 
researches in the anthropometric laboratory of Francis Galton, he 
inaugurated in 1887 a series of experiments with undergraduates at
Harvard University, which investigation he continued at the Uni- 
versity of Pennsylvania and Bryn Mawr College in 1888 and 1889, 
and in the following years at Columbia University. The report 
entitled, "Physical and Mental Measurements of the Students of
Columbia University", which appeared in the Psychological Review 
for November, 1896, and in which Professor Cattell collaborated 
with Dr. Livingston Farrand, was probably the first publication of 
the results of a systematic study of the mental status of the college 
student. This report is of peculiar interest today not only because 
of its scope, but also in view of the surprising number of mental and 
physical tests actually employed or suggested at that time which 
now constitute the accepted instruments of every clinical psychologist. 
While the purpose of the investigation was necessarily the establish- 
ment of norms by the statistical treatment of the test results of one 
hundred students, and the aim of the present study is rather the
observation of individual variation, it will nevertheless be of interest
to indicate briefly the character of the information recorded by 
Cattell. Anthropometric measurements such as height, weight, and 
cephalic diameters were noted, and in addition such physiognomic
characters as the color of hair and eyes, and the size and shape of 
ears. In addition, psychophysical determinations of visual and 
auditory acuity, sensitivity to pain, and various types of reaction 
time were made, as well as tests of a more strictly psychological 
nature which included memory of drawn lines, memory of numbers 
heard, cancellation test, color preference, types of imagery, and 
others. 

The investigation under consideration was carried on during
the academic years of 1894-95 and 1895-96, and it is of interest to 
note that the results were published so as to be of assistance to a
committee appointed at the annual meeting of the American Psy- 
chological Association held at Philadelphia in December, 1895, to 
consider the feasibility of co-operation among the various psycho- 
logical laboratories in the collection of mental and physical statistics. 
This "Committee on Mental and Physical Tests", which consisted 
of Professors Cattell, Baldwin, Jastrow, Sanford, and Witmer, may 
well be said to have laid the foundation for all subsequent developments
in the realm of psychological tests in its report to the Psychological
Association at the meeting held in Boston in 1896. This
report may be found in the Psychological Review of March of the 
following year. 

Having thus briefly indicated the inception of the present field 
of investigation, it would be a thankless task to trace its history down 
to the present moment in any adequate manner. Studies of this 
character have been carried on in every psychological laboratory 
connected with a college or university, and a complete bibliography 
of the reports on the subject would cover many pages. It will be 
well, however, to mention a few of the more important investigations 
which have a direct bearing on the present problem, in so far as it 
concerns the correlation of test results with academic standing. 
Wissler (1) correlated the results published by Cattell and Farrand, 
to which reference has been made above, with the university grades 
assigned to the hundred students under consideration. Calfee (2)
has reported on "Four General Intelligence Tests" given to approx- 
imately one hundred students at the University of Texas. Similar 
investigations have been made by Rowland and Lowden (3) at 
Reed College, Waugh (4) at Beloit College, and by Kitson (5) at the
University of Chicago. The latter study is particularly worthy of 
note in that a very careful and intensive examination of forty students 
was made. King and McCrory (6) report the results of tests on 
five hundred freshmen at the University of Iowa, Caldwell (7) has
correlated the Intelligence Quotient of approximately one hundred
students at Randolph-Macon Woman's College, as determined by 
the Adult Tests of the Stanford Revision, with college grades, and 
Rogers (8) gives interesting results of her investigation at Goucher 
College. In the reports mentioned above, Kitson and Caldwell also 
record correlations between test results and estimated intelligence, 
which will be referred to later in this discussion. Incomplete as is 
the preceding sketch, it nevertheless gives some indication of the 
wide-spread interest in the application of mental tests to the college 
student. In this connection it will likewise be well to refer to the 
comparatively recent development in the field of psychological entrance
examinations,* which are now demonstrating their practicability
in a number of the larger universities, and which constitute
a further ramification of the same problem. 

Experimental Conditions. 

Stated briefly, the aim of the present study is to examine certain 
data which have been collected relative to each member of the class 
in elementary psychology at the University of Pennsylvania during
the academic year 1919-20. This information consists of the score
obtained in a "general intelligence examination", the results of a
series of psychological tests, a rating on estimated competency, and 
a rating based on the academic standing of the individual as de- 
termined by the final grades received in all courses completed at the 
University. The treatment of results will be concerned with the 
examination of correlations existing between the various ratings under
consideration, and with the scrutiny of the individual record with a 
view to reaching, if possible, some conclusions which might be of 
assistance to the student in the direction of his intellectual 
development. 

The investigation differs from many which have preceded it, in 
that the psychological tests, with one exception, were given as a 
part of the ordinary class instruction and therefore not primarily
as tests. The elementary work in psychology consists of two courses 
known as Psychology 1 and 2, each requiring five hours of class 
attendance and continuing throughout one semester. Since credit 
in Psychology 1 is prerequisite to admission into Psychology 2, 
the two courses may be considered as a single introductory course 
lasting through the full academic year. Of the five hours of class 
attendance per week, only one hour is occupied by a formal lecture,
the remaining four hours being devoted to laboratory work. During 
the first semester a number of mental tests are given as a part of the 
laboratory work and with the purpose of graphically demonstrating 
the various factors which function in the formation and development
of the intellect. It is believed that this method enables the student
better to understand and appreciate the particular ability or mental
process under discussion. It is not claimed, therefore, that the
series of tests employed would necessarily have been chosen had the 
purpose been the psychological examination and diagnosis of the 
individual to the exclusion of other considerations. However, the 
tests unquestionably provide a very satisfactory framework upon 
which to build a logical presentation of systematic psychology as 
well as offering a medium for the demonstration of fundamental 
psychological processes. In addition, the tests are extremely valuable 
to the student, in that they enable him to determine his peculiar 
mental assets and liabilities through a comparison of his individual 
results with accepted standards or class distributions. 

Since the tests under consideration were given as a part of the
usual classroom procedure, the scientifically controlled conditions 
which are generally regarded as indispensable to a psychological 
investigation of this character were for the most part lacking. As 
the class in Psychology 1 numbered more than two hundred students,
the laboratory work was conducted in three sections with an average
enrollment of approximately seventy. These three sections all met
in the same room, one being held at eight-thirty o'clock in the morn- 
ing, another at two in the afternoon, and the third at three o'clock 
on a different afternoon. While the time of meeting was constant 
for each section, the variation in hour possibly affected the com- 
parability of section results. With such a large number of students 
in a laboratory class, some were necessarily seated at a greater 
distance from the instructor than others and in addition a few were 
near windows which may have provided distraction of one kind or 
another. In some cases, the same test was given to the three sections 
by different experimenters, and although the attempt was made to 
adhere as closely as conditions would permit to the standard pro- 
cedure, this variant may have affected the results to some extent. 
In summary, lack of uniformity in the time of meeting of the different 
sections, in the seating arrangement of the classroom, and in the 
identity of the experimenter may be considered factors which expose 
this investigation to criticism as being unscientifically conceived and
prosecuted. 

The comparative absence of controlled experimental conditions, 
however, cannot be said to invalidate the results. It is an open 
question whether the environment imposed upon a subject by 
scientifically controlled conditions elicits a more representative sample
of behavior than that produced under less artificial circumstances.
Is the psychologist more interested in the reaction of a subject who 
has been isolated in a sound-proof cabinet with a screen before his
eyes to eliminate distracting visual stimuli, or in the behavior of the 
same individual as displayed in natural association with his fellows?
For some, the classroom would provide as unnatural an environment 
as any that the experimentalist might impose, but for a group of 
university students no more satisfactory and less distracting atmosphere
could be selected than that of the recitation hall or laboratory.
It is contended, therefore, that the experimental results here 
presented provide an index of the mental status of the college 
student as reliable as any that might have been obtained under
other conditions. 

Having thus disposed in a somewhat arbitrary manner of any 
criticisms which might be voiced against the general procedure fol- 
lowed in this investigation, it will be well to consider the treatment 
of the data collected before undertaking a description of the specific
tests employed. As has been indicated, the information available 
concerning each member of the group here studied consists chiefly
of the results of a series of mental tests and the academic record of
the student as displayed in his college grades. The problem of
devising some statistical method by which the various scores and 
grades may be made easily comparable is immediately encountered. 
For example, a member of the class might have obtained a score of 
131 in the general intelligence examination, a time rating of forty- 
three seconds in a mechanical test, and he may have an audito- 
graphic memory span of eight digits as well as a number of other 
test results. In addition, his college record may show that he has 
received the highest grade in 10 per cent of his academic work, a 
passing grade in 70 per cent, and that he failed in the remainder of 
his courses. The necessity of reducing these various values to some 
common denominator so as to render them comparable is evident. 

Perhaps the most natural procedure would have been to obtain 
arithmetical averages of the results of each test and rate the indi- 
vidual performance in terms of its variation from the average. After 
determining a rank order in academic standing it would then have 
been possible to calculate the correlations and intercorrelations 
desired. Such a method is valuable in the examination and stand- 
ardization of tests, but it has little to offer when the interest is 
chiefly centered in the study of the individual rather than the tests, 
and it has a tendency to obscure significant personal variations under
a mass of figures. Indeed, it is probable that correlation as a sta- 
tistical method has been carried to extremes in recent psychological 
investigations. When the results of two mental tests show a high 
degree of correlation, it does not necessarily follow that they tap 
two abilities which are mutually dependent, but rather that the 
tests have called the same ability or group of abilities into play. 
Conversely, a lack of significant correlation may show either that 
one of the tests is unreliable or that the results are not dependent on
some common factor. If college psychological tests are designed to 
call into play the same abilities which function in college grades, 
such tests are useless unless a high degree of correlation with academic 
standing can be demonstrated. On the other hand, the absence of 
such correlation does not show the tests to be devoid of significance, 
but merely that they measure other abilities or factors than are 
predominant in the attainment of grades. Further, if it be admitted 
that individual competency is the algebraic sum of the various 
specific abilities and disabilities, then the ideal series of psychological 
tests, which would include a different test for each special ability,
would show no significant intercorrelations for individuals at the 
same level of general intelligence. 
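
The correlation of two rank orders referred to above may be illustrated by a short sketch. It assumes one common rank-order measure (Spearman's coefficient, computed as the ordinary correlation of the ranks), which is not necessarily the formula employed in this study; the function names and the sample figures are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average
        pos = end + 1
    return result


def rank_correlation(x, y):
    """Correlation of the ranks of x and y (Spearman's coefficient)."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical test scores and academic ratings for five students.
test_scores = [131, 104, 118, 97, 125]
academic = [5, 2, 3, 1, 4]
print(rank_correlation(test_scores, academic))
```

A coefficient near 1 would indicate that the two measures order the students alike; a coefficient near 0, that they call different abilities into play or that one of them is unreliable.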

The purpose here, therefore, is to present the material in such 
form as best to facilitate the scrutiny of the individual record, rather
than in the form most convenient for statistical treatment. Hence
the various results must be rated on some common scale which has 
steps of sufficient number to provide the necessary differentiation 
without introducing a false accuracy. In addition, since many of 
the tests used have not been scientifically standardized, it is im- 
portant to adopt a rating system which will permit the comparison 
of test scores with each other rather than with accepted standards. 

A consideration of the many rating scales which lend them- 
selves to the present purpose shows that the extremes are to be 
found in the percentile and the two-division systems. It is hardly 
necessary to enter into a discussion of the pseudo-accuracy of the 
percentile grade. It is only in the very unusual case that the
material to be rated can be clearly enough differentiated to give any 
real significance to each of the hundred points on the percentile scale. 
Investigations have shown the wide variation in grades given by 
different instructors to the same examination paper even in the 
field of mathematics where the greatest accuracy might be expected. 
This variation, however, is no greater than that shown in the grades 
given by the same scorer to the same paper at different times. The 
injustice done to the college student who receives a final mark of 
69 per cent in the course which demands 70 per cent as a passing 
grade has been commented upon too frequently to require more than
passing mention in this discussion. Obviously, the refinement of the 
percentile scale is too great for the material here at hand. On the 
contrary, the system which merely distinguishes the "passing" from 
the "not passing" does not provide sufficient differentiation for 
analytic examination of the results of a series of mental tests. 

Popular acceptance would seem to have stamped its seal of 
approval on a five-division rating scale. Cabbages and kings alike 
are usually judged mediocre, good or very good, poor or very poor. 
The great majority of our quantitative expressions are given in 
these terms, and the system seems to provide a sufficient number
of significant levels without introducing the fallacy of too great 
refinement. This psychological justification of the five-point scale, 
as well as other considerations of convenience and facility of compari- 
son, led to its adoption as the most satisfactory method of treating 
the various results and scores herein presented. In accordance with 
this decision, the results of each test given to the two hundred students 
who comprised the class in elementary psychology were arranged in
rank order and separated into quintiles. While the nature of some 
of the tests has made even such a coarse rating as this quite difficult, 
it is believed that the system adopted is the most practicable that 
could have been devised for the present purpose. Since all grades
assigned in the School of Arts and Science at the University of Pennsylvania
are recorded in terms of a five-point system, an added
advantage is gained in the comparison of test scores with academic 
success. 

The results tabulated in a later section will therefore not be 
found to contain the number of digits for the memory span, the
number of seconds required for the completion of the cylinder test, 
or the number of problems correctly solved in the general intel- 
ligence examination, but instead the translation of each of these 
scores into a quintile rating. If the performance of an individual
places him in the best twenty per cent of the class in a particular
test, he is given a rating of "5", if in the poorest fifth of the group
of two hundred, his quintile grade would be "1". The upper, middle,
and lower quintiles are represented by "4", "3", and "2", respec- 
tively. By thus evaluating a given performance in terms of the class 
results, it will be found a relatively simple matter to scrutinize the 
ratings for each individual and gain a fairly trustworthy impression 
of his standing in an unselected group of university students, and at
the same time to note his peculiar mental assets and liabilities. 
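
The translation of raw scores into quintile ratings described above may be sketched as follows. This is a minimal illustration only: the function name, the treatment of tied scores (left in their original order), and the sample figures are assumptions, not details recorded in the study.

```python
def quintile_ratings(scores, higher_is_better=True):
    """Rate each score from 1 (poorest fifth of the group) to 5 (best fifth)."""
    n = len(scores)
    # Arrange the group in rank order from poorest to best performance.
    # For timed tests a smaller number of seconds is the better result.
    order = sorted(range(n), key=lambda i: scores[i], reverse=not higher_is_better)
    ratings = [0] * n
    for rank, i in enumerate(order):
        # Divide the rank order into five equal steps.
        ratings[i] = rank * 5 // n + 1
    return ratings


# Hypothetical memory spans (digits) for ten students: more is better.
spans = [6, 9, 7, 8, 5, 10, 11, 4, 12, 13]
print(quintile_ratings(spans))

# Hypothetical completion times in seconds for five students: less is better.
times = [43, 60, 35, 90, 50]
print(quintile_ratings(times, higher_is_better=False))
```

Because each rating depends only on position within the group, results expressed in digits, seconds, or problems solved become directly comparable with one another.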

Selection of Group. 

Since it is the aim of this investigation to discover individual 
differences in a comparatively homogeneous group of students, it 
seemed advisable to make certain eliminations before undertaking an 
intensive study of test scores and college grades. Of the 220 students 
who registered for Psychology 1 at the beginning of the session of 
1919-20, fifteen withdrew before the work of the semester was really 
under way, reducing the class to an actual enrollment of 205. Of
these, 125 were taking the course in the School of Arts and Science, 
the remainder being students in the School of Education. This split 
also gives the approximate ratio of men to women in the class. Dur- 
ing the semester twenty members of the class were dropped because 
of deficiency or received a failure upon the termination of the course 
which excluded them from participation in Psychology 2. Since it 
was deemed advisable to make the completion of both courses one 
of the requisites for inclusion in this study, these twenty students 
were automatically eliminated. In order to obtain homogeneity it 
was also decided not to introduce sex differences but to limit the 
investigation to male students enrolled in the School of Arts and 
Science. Of the 125 men who originally started the course only 113 
were eligible for Psychology 2, and of these only eighty received 
final grades at the end of the second semester. Since one of the 
ratings to be taken into consideration is based on academic standing,
it was thought best not to include first-year students in the selected
group, thereby eliminating all who were not able to survive at least 
one year of university work, reducing the variation in age, and at the 
same time making it possible to base the academic rating on college 
grades received during two or more years of class attendance. 

When these eliminations had been made, fifty-one students 
were eligible for inclusion in this study. Of this number, one individual
over thirty years of age was arbitrarily excluded as not conforming
to the normal college age. Of the fifty remaining as subjects
of this investigation, thirty-three had sophomore standing, twelve 
were rated as juniors, and five were seniors. The average age for
the group as of October 1, 1919, was 20.8 years, that of the sophomores
being 20.5 years, and of the juniors and seniors, 21.3 and 21.4
years respectively. Although the averages in the latter cases are 
not of great significance due to the small size of the groups in question, 
the figures quoted do show that the larger group of fifty is composed 
of students of approximately normal college age. In conclusion, it 
will be well to point out that although only about one-fourth of the
total class in psychology is to be included in the study, the selection 
was made on the basis of group qualifications and without regard to 
individual merit, except for the automatic elimination of those mem- 
bers of the class who were excluded for deficiency in scholarship. 
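
The figures quoted in this section can be checked against one another with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Figures quoted in the text: (class standing, number of students, average age).
groups = [("sophomores", 33, 20.5), ("juniors", 12, 21.3), ("seniors", 5, 21.4)]

total_students = sum(count for _, count, _ in groups)

# Average age of the whole group, weighting each standing by its size.
weighted_age = sum(count * age for _, count, age in groups) / total_students

# Share of the actual class enrollment (205) retained for the study.
fraction_of_class = total_students / 205

print(total_students, round(weighted_age, 1), round(fraction_of_class, 2))
```

The weighted average agrees with the 20.8 years reported for the group of fifty, and the retained share is close to the one-fourth mentioned above.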

The Psychological Tests.

The psychological tests included a general intelligence examination,
the "Psychological Examination for College Freshmen and
High School Seniors", devised by Professor L. L. Thurstone, and the 
following thirteen tests designed to exercise some particular ability 
or group of abilities: (1) Ausfrage (Observation) Test, (2) Taylor 
Number Test, (3) Memory Span for Digits, (4) Memory Span for 
Syllables, (5) Memory Span for Ideas, (6) Description of Formboard, 
(7) Trabue Language Test, (8) Courtis Arithmetic Test, (9) Differ- 
ences and Likenesses Test, (10) Opposites Test, (11) Definitions 
Test, (12) Humpstone Memory Test, (13) Witmer Cylinder Test. 

The tests were given in the order indicated, and, with the excep- 
tion of the Witmer cylinder test, all were given during the first half 
of the academic year, or, in other words, as part of the laboratory 
work in Psychology 1. The cylinder test was given in connection 
with the competency rating toward the close of the second semester, 
and it is the only one of the series which was given as an individual 
and not as a group test, and likewise it alone was given primarily as 
a test and not for its didactic or illustrative value. Of the series 
employed, the memory span for digits, the Trabue sentence completion,
the Courtis arithmetic and Witmer cylinder tests are all in
general use and have been carefully standardized. The Ausfrage, 
memory span for syllables and for ideas, description, differences and 
likenesses, opposites, and definitions tests have merely been adapted 
to the present instructional aims, while the Taylor number test and 
the Humpstone memory test are here described for the first time. 

Before undertaking a description of the various tests it will be 
well to note in connection with the scoring that the quintile ratings 
were in each case based on the results of the class of approximately 
two hundred students and not on the relative performance of the 
fifty here to be considered. 
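
The quintile procedure referred to above may be sketched as follows. This is an illustration only and not the author's worked method: the text does not state which end of the scale is numbered 1 nor how tied scores were treated, so the sketch assumes that rating 1 denotes the highest fifth of the class and that tied scores share the rating of the first of them in rank order. The names `quintile_ratings`, `class_scores` and `selected` are hypothetical, as are the scores.

```python
def quintile_ratings(scores):
    """Return {student: rating}, rating 1 (top fifth) to 5 (bottom fifth).

    ASSUMPTIONS, not taken from the text: 1 is the best fifth, and tied
    scores share the rating of the first tied student in rank order.
    """
    n = len(scores)
    # Rank order, highest score first.
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ratings = {}
    previous_score, previous_rating = None, None
    for position, (student, score) in enumerate(ordered, start=1):
        if score == previous_score:
            rating = previous_rating
        else:
            # Integer form of ceil(5 * position / n).
            rating = (5 * position + n - 1) // n
        ratings[student] = rating
        previous_score, previous_rating = score, rating
    return ratings


# Hypothetical class of ten. Ratings are fixed by the whole class; the
# students under study are merely looked up afterwards.
class_scores = {f"student_{i}": 100 - 3 * i for i in range(10)}
class_ratings = quintile_ratings(class_scores)
selected = ["student_0", "student_4", "student_9"]
selected_ratings = {s: class_ratings[s] for s in selected}
```

The point of the design is that `quintile_ratings` is called once on the full class, so a student's rating does not change with the composition of the smaller group selected for study.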

Thurstone Psychological Examination.

On the afternoon of October 26, 1919, some fifteen hundred first- 
year students in the various undergraduate schools of the University 
of Pennsylvania were given the Thurstone "Psychological Exam-
ination for College Freshmen and High School Seniors", the experi- 
ment being conducted by the Department of Admissions in co- 
operation with a number of other colleges and universities in the
state of Pennsylvania. At the same hour, the examination was 
given to approximately 120 students who were then meeting in 
different sections of Psychology 1, with the purpose of comparing 
the scores obtained by this relatively selected group, which included 
no freshmen, with the results of the larger first year group. The 
fifty students who form the basis for this investigation all took the 
examination as members of laboratory classes in psychology. 

Description: The form which was used is known as "Test IV,
Edition of September, 1919 — issued by L. L. Thurstone of the 
Carnegie Institute of Technology". The examination consists of 
168 short problems which are to be solved in order. The printed 
directions on the cover of the pamphlet, and the specific nature of 
the instructions for each problem greatly simplify the administration 
of the test. The important timing element, which is a complicating 
factor in such examinations as the Army Alpha and the Otis intelli- 
gence test, is practically eliminated in this case. The directions, 
which are read by the examinee before the beginning of the exami- 
nation, state that thirty minutes will be given in which to solve as 
many problems as possible. The problems are to be taken in order, 
but instructions are also given to skip any which may not be under- 
stood. The task of the examiner, therefore, is merely to call attention 
to the directions after the pamphlets have been distributed, and to 
give the appropriate signals at the beginning and end of the thirty- 
minute period. Although the subject is directed to solve the problems
in order, the final score is determined solely by the number of
correct solutions without reference to errors or omissions. 

The 168 problems which compose the examination are arranged 
in what is known as the cycle-omnibus form. In other words, while 
only six different tests are employed, the separate problems which 
go to make up each test appear in rotation instead of being grouped 
together as is more usually the case. The examination may readily 
be analyzed into a number of sets of eight problems each, and in 
each set all of the six types of tests occur in regular order. The 
first two problems in each group form part of a general information
test, while the next two are a variation of the familiar analogies test, 
and the fifth is a sentence completion test taken from the language 
scales devised by Trabue. The sixth problem in each set is of the 
type known as the syllogism test, and the seventh, referred to by 
Thurstone as the reading test, is a form of the widely-used proverbs 
test. The last problem of the group is an example of the number 
completion test. 

Since eight of the 168 problems are preliminary samples for 
which the correct solution is given, the examination actually con- 
sists of only 160 problems of which forty comprise a test of general 
information, an equal number form an analogies test, while each of 
the other types is represented by twenty problems. The final score 
is therefore weighted in the direction of information and analogies. 
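
The cycle-omnibus arrangement and the resulting weighting can be checked with a short sketch. The order of types within a set of eight follows the description above; the assumption that the eight sample problems form the first complete set is not stated in the text and is made only for illustration. The names `CYCLE`, `problem_type` and `scored_composition` are hypothetical.

```python
from collections import Counter

# Test type at each of the eight positions within one set, as described above.
CYCLE = (
    "information", "information",
    "analogies", "analogies",
    "sentence completion",
    "syllogism",
    "reading (proverbs)",
    "number completion",
)

TOTAL_PROBLEMS = 168
SAMPLE_PROBLEMS = 8  # ASSUMPTION: the samples are the first full set of eight.


def problem_type(problem_number):
    """Return the test type of a problem, numbering from 1."""
    if not 1 <= problem_number <= TOTAL_PROBLEMS:
        raise ValueError("problem number out of range")
    return CYCLE[(problem_number - 1) % len(CYCLE)]


def scored_composition():
    """Count the scored problems of each type, leaving out the samples."""
    scored = range(SAMPLE_PROBLEMS + 1, TOTAL_PROBLEMS + 1)
    return Counter(problem_type(n) for n in scored)


composition = scored_composition()
```

Under these assumptions the count reproduces the figures in the text: twenty scored sets of eight give forty problems each for information and analogies and twenty for each of the remaining four types.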

Discussion: It is not the present intention to enter into a 
lengthy criticism of the validity of general intelligence tests. Ever 
since the Binet-Simon scale came into popular use, this question 
has been discussed with varying degrees of fervor, and the many
recent additions to the store of group tests, which have appeared as 
an aftermath of the army series, have served to keep the controversy 
before the psychological eye. Even the most conservative intro- 
spectionist must admit that the army tests performed a valuable 
service in the stratification of the National Army, and that satis- 
factory results are being obtained at several of the larger universities
by the admission of students on the basis of group psychological
examinations in lieu of the traditional entrance requirements. The 
general intelligence test is of established significance in the differ-
entiation of the various well-recognized levels of performance. The 
question which must be broached here is whether it is of equal 
significance when applied to individuals at the same general intellec- 
tual level, and particularly whether it discloses any information of 
value relative to the college student. 

It may be contended that the Thurstone examination is designed 
for the elimination of applicants for admission, and that significant
results are not to be expected when the test is applied to students
who have not only met the entrance requirements but have success- 
fully completed at least one year of college work, as is the case with 
the present group. Nevertheless, it seems profitable to inquire into 
the particular abilities called into play when the examination is 
submitted to college students. A mere inspection of the series of 
problems quoted above will demonstrate that the correct solutions 
could be given by any person of the intellectual level of the college 
student were unlimited time at his disposal. An exception to this
statement must be made in the case of a few general information 
questions, which are so designed that no individual would be likely 
to give correct answers to all. Hence, whatever the abilities in- 
volved in the solution of the six different tests of which the exam- 
ination is composed, the score obtained is primarily an index of mental 
alertness or of the rapidity of the reasoning processes and not of 
what is usually termed general intelligence. If the colleges wish to 
admit candidates on the basis of the speed with which a problem can 
be solved and without regard to the proportion of correct solutions, 
then the Thurstone examination should be found very satisfactory. 
Or if experimentation can demonstrate that the rapid thinker is also 
the accurate thinker, this type of test will be equally acceptable. 
In this connection it is interesting to note that a correlation between 
the score on the Thurstone test and the percentage of correct answers 
to the total number attempted shows the unexpectedly high coeffi-
cient of +0.74 (Pearson) in the case of fifty results chosen at random. 
This would seem to indicate that accuracy and speed are closely 
related, and must be considered as arguing for the validity of the 
examination. A study of the same fifty cases shows that on the 
average only 85 per cent of the solutions given were correct, the 
syllogism test being the most difficult with 23 per cent incorrect, 
while the greatest accuracy was shown in the analogies, sentence 
completion, and number completion tests, each of which had an 
error of only 10 per cent. 
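
The comparison of speed with accuracy described above rests on two quantities for each paper: the score (number of correct solutions) and the percentage of correct answers among those attempted. A minimal sketch of the Pearson product-moment computation is given here. The function names and the sample papers are hypothetical and are not the data of the study.

```python
from math import sqrt


def percent_correct(num_correct, num_attempted):
    """Percentage of attempted problems that were solved correctly."""
    return 100.0 * num_correct / num_attempted


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length series.

    Raises ZeroDivisionError if either series has no variation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical papers as (number correct, number attempted).
papers = [(60, 75), (82, 90), (95, 100), (70, 88), (110, 118)]
scores = [correct for correct, _ in papers]
accuracy = [percent_correct(c, a) for c, a in papers]
r = pearson_r(scores, accuracy)
```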

Although the "cycle-omnibus" type of examination has marked 
advantages, chief among which are simplicity in administration and 
scoring, one important weakness must be noted. Assuming that 
the six tests which compose the examination call into play different 
abilities, it is often desirable to analyze a given score in order to 
determine individual assets or deficiencies. In other words, a low 
score might be due either to a poor performance in all six of the 
tests, or to a particularly deficient result in any one of them, such 
as the general information test. While the score would be the same 
in both cases, its significance would be very different. Most of the
general intelligence tests are so arranged that the scores for the dif-
ferent parts of the examination are readily available for comparison. 
In the case of the cycle-omnibus, however, an analysis of the various 
test results is practically impossible in view of the undue expendi- 
ture of time and effort required. 

As in the case of the other tests, the class scores for the Thurstone 
examination were arranged in rank order and quintiled. The rating 
for each individual in the table of results shows the quintile grade 
and not the actual score. A discussion of the results and correlations 
obtained will appear in a later section. 

Ausfrage Test 

Description: This test is a variation of the familiar Aussage
test, differing from it only in that specific questions are asked. In 
the first part of the test a picture was thrown on the screen with the 
aid of a stereopticon and the class allowed to examine it for two 
minutes, the following instructions having previously been given: 
"I am going to throw a picture on the screen. While it is there I
want you to do nothing but look at it. When I have finished I will 
ask you to answer some questions." Upon the removal of the 
picture, ten questions were asked relative to different objects which 
may or may not have appeared in the picture. 

The second part of the test consisted of a series of ten questions 
based on observation of the university buildings and campus and of
the city of Philadelphia. In both parts of the test written answers 
were obtained. 

In scoring the results, each correct answer received one point, 
giving a maximum score of twenty points. The class results were 
distributed in rank order and quintile ratings determined. 

Discussion: The ability primarily involved in this test is that
of observation, which implies attending to something and making 
note of it for a purpose. In this case, the stimulus was visual, and 
therefore visual sensibility and discrimination are essential. It may 
be assumed, however, in connection with this test as well as those 
which follow, that every member of the class was equipped with the 
necessary sensibility and with the psycho-motor apparatus involved 
in the recording of results, and these factors will therefore be dis- 
regarded in discussing the various tests. Analytic concentration 
and distribution of attention play a part in the process of observa- 
tion, as does the factor of associability, which will be discussed at 
some length in connection with memory span. Memory enters but 
little into the first part of the test, since the retention required is of 
brief duration, but it must be considered an important element in 



17 

the second part. While all of the abilities mentioned are involved, 
the test may be regarded as primarily one of observation. 

Taylor Number Test.

Description: The test material consists of a sheet of white paper 
8½ x 10 inches in size, upon which are distributed in a haphazard
arrangement the numbers from 1 to 50, inclusive, printed in half-inch 
bold-face black type. One sheet was handed to each student with 
the numbered side of the paper downward, while the following
directions were given: "I am going to give each of you a sheet of
paper. I want you to let it lie on your desk until I tell you what
to do with it. When I am ready I shall give three commands, the
first, 'Ready', the second, 'Turn', and the third, 'Go'. When you
turn the paper, turn it from the right side over to the left, and in
the upper left hand corner you will find the number '1'. On the
paper are the numbers from 1 to 50, not arranged in any regular
order, but scattered over the sheet. As soon as you have turned
the paper, place your pencil on number 1. When I say 'Go', draw
a straight line to number 2, then to 3, and go on in order to each
number until I say 'Stop'. When I say 'Stop' hold your pencils in
the air immediately."

A time limit of forty seconds was allowed for the test, and the 
results were then scored on the basis of the highest number reached. 
The distribution of class results was made and the quintile ratings 
obtained.

Discussion: In so far as is known, this test was devised by
Mr. Charles K. Taylor and was first used a number of years ago in
the Psychological Laboratory of the University of Pennsylvania.
When repeated a number of times the Taylor number test serves as
an excellent index of trainability, but when only one trial is allowed
it must be considered a test of alertness or distribution of attention. 
In many ways this test is similar to the more familiar "Cancellation 
Test", but it has the advantage of providing no definite cues to 
exploitation, since great care was taken not to arrange the numbers
on the sheet in an orderly manner. In addition, the goal is con-
stantly changing in this test while it remains constant in the can- 
cellation test, where the aim is to locate some particular letter or 
digit. 

It would seem unlikely that discrimination of form would be a 
factor worthy of consideration in the performance of this test by 
college students, but under the conditions of rapid exploration which 
usually exist this element cannot be overlooked. The test also has 
an important motor phase, and coordination and control of movement
play a rather important part in the result. However, the
higher scores may be attributed to good distribution of attention 
coupled with methodical exploration. 

Memory Span for Digits. 

Description: The material for this test consists of twenty series
of digits, ranging in length from three to twelve digits, and including 
two series of each length. The series used were employed by H. J. 
Humpstone in his standardization of the test, and were so prepared 
that no two digits occur in the natural order or in the reversed order, 
no two succeeding series begin with the same digit, and no digit is
repeated except in the series of ten or more. Zero is not used. 
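
The rules of construction just stated can be expressed as a simple check. The sketch below reads "natural order or reversed order" as meaning that no two neighbouring digits differ by exactly one; that reading is an assumption, as are the function names, and the series shown in the examples are invented and are not Humpstone's material.

```python
def is_valid_series(series):
    """Check one digit series against the construction rules above."""
    # Zero is not used.
    if any(digit < 1 or digit > 9 for digit in series):
        return False
    # ASSUMED reading: neighbouring digits may not be consecutive,
    # either ascending (natural order) or descending (reversed order).
    for first, second in zip(series, series[1:]):
        if abs(first - second) == 1:
            return False
    # No digit is repeated except in the series of ten or more.
    if len(series) < 10 and len(set(series)) != len(series):
        return False
    return True


def no_shared_first_digit(series_list):
    """True if no two succeeding series begin with the same digit."""
    return all(a[0] != b[0] for a, b in zip(series_list, series_list[1:]))
```

For example, `is_valid_series([3, 7, 1])` passes, while `[3, 4, 9]` fails because 3 and 4 stand in the natural order.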

The instructions given were those used by Humpstone (9). 
"This is an experiment. In every experiment it is necessary for
everyone who takes part to do just what the experimenter asks.
Please do just as I ask you to. I am going to say some numbers. 
While I say them I do not want you to do anything except look at 
me and hold your pencils up where I can see them. When I put my 
pencil down you write on your paper the numbers I have said." 
The digits were then pronounced at the rate of one per second, with- 
out rhythm or change of intonation except that on the last one of a 
series the voice was allowed to fall as a signal for reproduction. In 
each case the number of digits in the series was announced before 
the series was given. In scoring the results, the number of digits 
in the longest series correctly reproduced is considered the memory 
span. The quintile ratings are based on the scores thus obtained. 
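
The scoring rule for the memory span may be put in a few lines. This is a sketch under the rule as stated, namely that the span is the length of the longest series reproduced correctly; the function name and the sample series are hypothetical.

```python
def memory_span(presented, reproduced):
    """Length of the longest series reproduced exactly; 0 if none was."""
    return max(
        (len(given) for given, written in zip(presented, reproduced)
         if list(given) == list(written)),
        default=0,
    )


# Hypothetical record: the second four-digit series has two digits transposed.
presented = [[3, 7, 1], [5, 2, 8], [4, 9, 2, 6], [7, 3, 8, 1]]
reproduced = [[3, 7, 1], [5, 2, 8], [4, 9, 2, 6], [7, 3, 1, 8]]
span = memory_span(presented, reproduced)
```

Because two series of each length are given, one correct reproduction at a given length suffices for credit at that length under this rule.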

Discussion: According to Professor Humpstone, "It has been
assumed by almost everybody who has written on the test that it 
tests memory. A careful analysis causes us to doubt the validity of 
the assumption. Some imagination is required. The subject must 
have enough imageability to get perceptions of the stimuli .... 
In the same sense memory is involved. The images must be retained 
long enough for reproduction. But this period is so brief that the 
results do not furnish any criterion by which to judge of retentive- 
ness .... Attention is involved also .... The ability to dis- 
tribute the attention well is doubtless an aid in the performance."
He continues, "Perhaps the memory span test comes nearer to testing
one definite ability than any other test. Whatever other factors or 
abilities enter into the performance of this test it is clear that the 
thing specifically tested is the ability to grasp and associate a number 
of discrete units of perception in a definite order. This is not memory 
as pointed out above. We are using the term associability and 
subsuming it under the general heading imagination. Associability
refers to the 'number of discrete perceptions associated in a single
act of attention, and the combination of the associated component
parts of a single perception'."

While the memory span test is of great value in the examination 
of the mentally retarded, and it can be said without fear of contra- 
diction that a memory span of four and probably of five is pre- 
requisite to intellectual development, the test loses much of its 
significance when applied to a group of college students. Certainly 
in the case of the higher scores the result has been exaggerated by 
means of grouping, and the factor of planfulness plays an important 
part. The lower memory spans of five and six digits are probably 
of greater significance. 

Memory Span for Syllables. 

Description: The subject-matter of the test consists of sixteen 
sentences ranging in length from ten to fifty syllables. The series 
provides two sentences at each of the various levels, namely ten,
twenty, twenty-five, thirty, thirty-five, forty, forty-five, and fifty 
syllables. The sentences were prepared by H. J. Humpstone and 
were all taken from a popular current periodical, so as to obtain 
material of a non-technical character which would be of suitable 
difficulty and complexity for the ordinary adult. In each pair of 
sentences, the first is designed to encourage visual imagery, while
the second is of a more abstract nature and does not lend itself 
readily to any type of sensory imagery. 

In administering the test, the sentences were read aloud with 
natural expression, the class having been instructed to reproduce
each sentence graphically, immediately following its presentation.
The number of syllables in the longest sentence reproduced verbatim 
was considered the memory span for syllables for each individual. 
The scores thus obtained were distributed in the usual manner and 
quintiled. 

Discussion: The test here described is an adaptation of the 
"repeating syllables" test used by Binet and modified by Terman
(10) in the Stanford revision of the Binet-Simon scale. It was felt 
that the sentences used by Terman in the average adult group were 
not well suited to the college student, and it was also desired to 
extend the series beyond twenty-eight syllables. Results so far 
obtained with the Humpstone sentences show a maximum span of 
forty syllables, a minimum of twenty, with a decided mode at thirty 
syllables. An analysis of a large number of results has shown no 
significant difference in the difficulty of the visual and abstract
sentences.

The test may be said to measure the integrated memory span.
While the factor of associability is probably predominant, the ele- 
ments in this case are not discrete as in the memory span for digits, 
and reproduction calls for a higher degree of intellectual organization.
Memory is a more important factor than in the span for digits, since 
the period of retention is somewhat longer, but again it cannot be 
considered the ability primarily tested. Language ability is certainly
involved but the popular character of the sentences employed
minimizes its importance when the test is applied to college students.
The use of tests of this nature as a measure of proficiency in a foreign
language is suggested in this connection. The memory span for
syllables must be considered an index of integrability rather than of
simple associability. 

Memory Span for Ideas. 

Description: The paragraph beginning "Tests such as we are
now making" from the superior adult series of the Stanford revision
was used as the material for the test. The standard directions were
given with necessary modification for graphic instead of oral reproduction,
as follows: "I am going to read a little selection of about
six or eight lines. When I am through I will ask you to write as 
much of it as you can. It doesn't make any difference whether you 
remember the exact words or not, but you must listen carefully so 
that you can write down everything it says." The paragraph was 
then read at a natural rate, following which adequate time was 
allowed for reproduction. 

The results were scored on the basis of the number of ideas 
correctly recorded, the paragraph having been analyzed into sixteen 
discrete ideas. The scores thus obtained were arranged in rank 
order and the quintile ratings determined. 

Discussion: While this test is spoken of by Whipple and others 
as a measure of logical as contrasted with rote memory, Terman calls 
attention to the fact that it is rather a test "of ability to comprehend 
the drift of an abstract passage". It seems more satisfactory, however,
to regard the memory span for ideas as a natural sequent to
the spans for digits and syllables. It will readily be granted that
the college student, who receives most of his mental pabulum through
the medium of lectures, can comprehend the drift of such a passage 
as the one here employed. The test must, therefore, be considered 
a measure of the subject's ability to associate in consciousness a 
number of logically related ideas. That this requires a higher level 
of intellectual organization than the verbatim reproduction of a 
sentence, as in the memory span for syllables, is hardly open to
question. The test, then, involves not only the element of associ-
ability but likewise a high degree of understanding and of intellect.
It would therefore be reasonable to expect this test to be more 
significant when applied to college students than either the memory 
span for digits or for syllables. 

While there may be some disagreement as to what constitutes 
the unit idea which is to be used as the basis for scoring, the method 
employed by Terman is too vague for the present purpose, and it is 
believed that the comparative results obtained by any logical scoring 
system will be significant. 

Description Test 

Description: The Witmer formboard, a modification of that of 
Seguin, was used as the object to be described. The Witmer board 
provides recesses for eleven forms, namely the square, rectangle, 
cross, oval, semicircle, star, equilateral triangle, isosceles triangle, 
hexagon, circle, and diamond. The following instructions were 
given: "I have here an object. I am not going to give you a name
for it. You can call it a 'thing' — call it 'X'. I want you to pass 
it around so that each one in the class has an opportunity to examine 
it." A number of formboards were then passed about the class, 
and after six minutes had been allowed for examination they were 
collected and placed on tables in different parts of the laboratory 
where they could easily be seen. Further instructions were then 
given. "What is it? In answer to that question I want you to
write a description in such a way that anyone would understand and 
recognise this object. You will be allowed twenty minutes in which 
to write this paper." 

Upon the completion of the twenty-minute period, the written 
descriptions were collected and redistributed to other members of 
the class. The number of "points of description" to be used as a 
basis for scoring of results was then determined in an open discussion.
The scores, which were later translated into quintile ratings, were 
therefore based on an empirical rather than an arbitrary standard. 

Discussion: The term "description" as used in this test has 
reference, not to a literary form, but to the enumeration of the
salient characteristics of the object in question. The test is obviously
related to the Ausfrage test previously discussed, in that observation
is an important factor. In this case, however, memory plays no
part, since the object to be described is displayed throughout the
twenty-minute period. The test resembles the Aussage test in that
no specific questions are asked, the score being based instead on the
number of points of description noted. The problem must therefore
be considered one of analysis, and the ability primarily involved may
be termed analytic concentration of attention. This ability is con- 
trasted with the distribution or alertness of attention called for in 
the Taylor number test. 

The description test was first used by Binet, who stated that 
individual psychology can be more readily studied through the 
examination of complex rather than simple mental processes. The 
test, in the form of description of pictures, is found in the Binet- 
Simon scale as well as in the Stanford Revision. When applied to 
children, the qualitative aspect of the description, whether mere 
enumeration of points or interpretation, is of more significance than 
the quantitative score used in this case. 

Sentence Completion Test.

Description: Language Scale "K", devised by M. R. Trabue
(11), was employed in this test. Owing to its wide familiarity, it is 
only necessary to remark in this connection that Scale K consists 
of seven sentences which are arranged in the order of increasing 
difficulty. Certain words in each sentence have been omitted and
the subject is asked to supply the missing words. The procedure 
standardized by Trabue was adhered to, the following explanation 
being given before the distribution of the forms. 

"This sheet contains some incomplete sentences which form a
scale. This scale is to measure how carefully and rapidly you can 
think, and especially how good you are in language work. You are 
to write one word on each blank, in each case selecting the word 
which makes the most sensible statement. You may have just five 
minutes in which to sign your name at the top of the page and write 
the words that are missing. The papers will be passed to you with 
the face downward. Do not turn them over until we are ready. 
After the signal is given to start, remember that you are to write 
just one word on each blank and that your score depends on the 
number of perfect sentences you have at the end of five minutes."

The forms were then distributed and the following additional 
instructions given. "After you have been working five minutes, I
shall say, 'The time is up. All stop writing!' You will all please 
stop at once and lay aside your pens (or pencils). Now if you are 
all ready, you may turn your papers, sign your names and fill the 
blanks." 

In scoring the results the method recommended by Trabue was 
followed, a sentence perfectly completed being given two points, 
one point being allowed where the idea was right but the best word 
not supplied, and a score of zero received where the completion was
unsatisfactory or omitted. The total number of points for the test
was determined and the quintile ratings given. 
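
The point scheme just described, two points for a perfect sentence, one where the idea is right but the best word is wanting, and none otherwise, gives a maximum of fourteen points for the seven sentences of Scale K. A brief sketch follows; the judgment labels and the function name are hypothetical, and the judgment of each sentence is assumed to have been made already by the scorer.

```python
# Hypothetical labels for the scorer's judgment of each sentence.
PERFECT = "perfect"
IDEA_RIGHT = "idea right, best word not supplied"
UNSATISFACTORY = "unsatisfactory or omitted"

POINTS = {PERFECT: 2, IDEA_RIGHT: 1, UNSATISFACTORY: 0}
SENTENCES_IN_SCALE_K = 7


def scale_k_score(judgments):
    """Total points for one paper, given one judgment per sentence."""
    if len(judgments) != SENTENCES_IN_SCALE_K:
        raise ValueError("Scale K has seven sentences")
    return sum(POINTS[judgment] for judgment in judgments)


# Hypothetical paper: four perfect, one partly right, two failed.
paper = [PERFECT] * 4 + [IDEA_RIGHT, UNSATISFACTORY, UNSATISFACTORY]
total = scale_k_score(paper)
```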

Discussion: Trabue, in discussing his language scales, does not
attempt an analysis of the abilities involved. He calls attention to
the fact that the completion test was characterized by Ebbinghaus,
who first used the method, as a "real test of intelligence", and that
other psychologists have classified it as a test of imagination, memory,
association, and various other "faculties". Trabue himself is satis-
fied with the statement that the "ability to complete these sentences
successfully is very closely related to what is usually called language
ability".

As has been mentioned by Whipple and others, the ability 
called into play by the sentence completion test varies greatly with 
the number and character of the elisions made. If the elisions are 
few and the nature of the context simple, the problem becomes 
merely one of controlled association. When the elisions are more 
numerous the test becomes one of active imagination. An inspection 
of the seven sentences which form Scale K will show that for the 
college student the first three sentences and probably the fourth 
present no imaginative problem, and may be considered comparatively 
simple tests of controlled association. The remaining sentences, 
however, are decidedly more difficult, as evidenced by the fact that 
very few errors were made in the completion of the first four sentences 
while many were recorded in the fifth, sixth and seventh, and these 
must be looked upon as tests of imagination. Nevertheless, language 
ability is of so complex a character, involving as it does various types 
of sensory imagery, memory, and intellectual organization, that the 
use of the term imagination in this connection is little more than 
begging the question. 

Although the abilities involved in the sentence completion test 
are difficult of analysis, the test is of proven significance as an index 
of "general intelligence", and a study of the nature of the errors 
made by a subject is often of diagnostic value. 

Courtis Arithmetic Test. 


Description: The wide acceptance of the Courtis standard 
tests (12) makes necessary only a brief description here. Series A, 
Form 3, of the Courtis arithmetic test was used. It consists of a 
group of eight separate tests in the fundamental processes of arith- 
metic and their application to problems of varying degrees of diffi- 
culty. The first five tests of the series measure efficiency in copying 
figures, and in simple addition, subtraction, multiplication and
division, respectively. The sixth test requires judgments of the
operation to be used in simple one-step problems, and is called by 
Courtis the speed reasoning test. The seventh, or "fundamentals",
test provides abstract examples in the four operations, and serves
as a "general measure of the ability to add, subtract, multiply and
divide with whole numbers". The eighth test requires judgment of
the operations to be used, as well as the actual solution of more 
difficult two-step problems. 

The standard procedure was closely adhered to in administering
the tests, one minute being allowed for each of the first six, twelve 
minutes for the seventh, and six minutes for the last test. After 
the results had been scored in the usual manner, the scores for each 
test were treated separately, class distributions being made and
quintile ratings assigned. The eight quintile grades for each indi- 
vidual were then averaged and the averages thus obtained were in 
turn put in rank order and quintiled. This final quintile rating 
appears in the tabulation of results in a later section of this 
report. 
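
The two-stage rating described above, in which each of the eight tests is quintiled separately and the averaged quintile grades are then themselves ranked and quintiled, may be sketched as follows. The sketch assumes that grade 1 is the best fifth and that ties fall in the order in which students are listed; neither point is settled by the text. All names and scores are hypothetical.

```python
NUMBER_OF_TESTS = 8


def quintiles_from_rank_order(ordered_students):
    """Assign grades 1 to 5 by position in a best-first list."""
    n = len(ordered_students)
    return {
        student: (5 * position + n - 1) // n  # integer ceil(5 * position / n)
        for position, student in enumerate(ordered_students, start=1)
    }


def courtis_final_rating(test_scores):
    """test_scores maps each student to a list of eight raw scores."""
    students = list(test_scores)
    if any(len(test_scores[s]) != NUMBER_OF_TESTS for s in students):
        raise ValueError("each student needs eight scores")
    # Stage 1: quintile each test separately (higher raw score is better).
    per_test = []
    for t in range(NUMBER_OF_TESTS):
        order = sorted(students, key=lambda s: test_scores[s][t], reverse=True)
        per_test.append(quintiles_from_rank_order(order))
    # Stage 2: average the eight grades, then rank and quintile the averages.
    averages = {
        s: sum(grades[s] for grades in per_test) / NUMBER_OF_TESTS
        for s in students
    }
    final_order = sorted(students, key=lambda s: averages[s])  # low is best
    return quintiles_from_rank_order(final_order)


# Hypothetical class of five with uniform scores on all eight tests.
example = {"a": [9] * 8, "b": [8] * 8, "c": [7] * 8, "d": [6] * 8, "e": [5] * 8}
final = courtis_final_rating(example)
```

Averaging grades rather than raw scores gives each of the eight tests equal weight in the final rating, whatever the range of its raw scores.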

Discussion: The Courtis arithmetic tests provide a valuable
illustration of the efficiency test as contrasted with that of intelli- 
gence. Although this is true to a greater degree of the first five 
tests than of the sixth, seventh and eighth, even these latter must 
be considered tests of efficiency when applied to college students.
It may be assumed that every member of such a group has the 
educational background and mathematical ability necessary to solve 
each of the simple problems presented, and the test therefore mea- 
sures the facility with which the fundamental processes can be
employed. It is not the intention here to attempt to analyze the 
specific abilities involved in arithmetic. It has even been asserted 
that mathematical ability is itself specific, akin, for example, to 
musical ability. Certainly, the factor of intellect cannot be disre- 
garded, and in such a test as this, alertness of attention and motor 
coordination are also important. Since the higher curriculum does
not frequently call for exercise in the simpler mathematical opera- 
tions, it is not surprising to find that the college student often fails 
to meet the standards of the higher elementary grades. This 
fact illustrates clearly the distinction between efficiency and com- 
petency. 

It would be well to note in this connection the service which
the Courtis tests have performed in introducing scientific measure-
ments in the field of education. The tests were designed primarily 
to determine the efficiency of the teacher or of the school system 
and not to discover individual competency. 




Differences and Likenesses. 

Description: The tests here referred to are all found in the 
Stanford Revision of the Binet-Simon scale, and include the "differ- 
ences" test from Year VII, the "similarities — two things" test from 
Year VIII, the "similarities — three things" test from Year XII, and 
the "differences between president and king" from Year XIV group. 
The Terman method was closely adhered to in giving the tests except 
that a time element was introduced. One minute was allowed for 
each part of the seven- and eight-year tests, two minutes for the 
twelve-year test, and five minutes for the fourteen-year test. The 
response in each case was written instead of oral. In scoring, one 
point was given for each correct difference or similarity. It was 
necessary, however, to quintile the papers largely on the basis of a 
qualitative judgment of the results, since the tests here described do 
not present a real problem to the college student. 

Discussion: Since the association of ideas with reference to 
differences and similarities constitutes the essential element of the 
higher thought processes, these tests are of great significance when 
applied to children, and were included in this series chiefly for their 
illustrative value. From a genetic point of view, the recognition 
of differences is an earlier development than the appreciation of 
similarities, as evidenced by the Terman standardization which places 
them at the seventh and eighth years, respectively. However, 
although similarity in the use of familiar objects should be given at 
the eight-year mental level, it is not until the twelfth year that the 
concept has become usable to the extent of classing the snake, the 
cow and the sparrow as animals. It is not until the adult level has 
practically been reached that the ability to appreciate essential 
differences and likenesses is evident, and this ability may be con- 
sidered a significant index of intellectual development. 

The test in its present form cannot be considered satisfactory 
for college students, and as Terman suggests it would be advantageous 
to develop and standardize a new test designed primarily for use in 
the upper years and at the adult level, and adapted to call into play 
the ability to give essential differences and likenesses. As a test for 
adults the one here used can only be said to exercise the associational 
processes. 

Opposites Test. 

Description: The difficult opposites found in list V, page 79, 
of Whipple's Manual of Mental and Physical Tests (13) was used. 
The directions suggested by Whipple were given, as follows: "Write 
as soon as I say a word as quickly as you can the word that means 
just the opposite. Opposites formed by the prefixes 'un' and 'in' 
or by the suffix 'less' are not to be given unless the root of the stim- 
ulus word is changed." The stimulus words were called at five- 
second intervals, and the results scored upon the basis of correct 
opposites determined in open discussion. 

Discussion: Tests of controlled association, such as the part-whole, 
genus-species, and opposites tests, are usually scored on the 
basis of the time required and the accuracy of the response. In the 
present case, however, since printed forms were not used, the time 
element had to be ignored except in so far as the five-second period 
eliminated all associations requiring a greater length of time. In the 
scoring of the test a difficulty was encountered in the determination 
of correct or permissible opposites, and in some cases where no 
original opposite could be agreed upon the use of two or even three 
terms was allowed. 

The opposites test has been extensively used by Thorndike, 
Woodworth and Wells, Miss Norsworthy and others. The abilities 
involved vary considerably with the ease or difficulty of the stimulus 
words. If the associations called for are too simple the response 
becomes automatic, while if the stimulus words are very difficult lack 
of familiarity with the terms is likely to interfere with the validity 
of the test. It may safely be stated that every word in the list here 
employed is familiar to college students, and that, with one or two 
exceptions, the associations required were difficult enough to eliminate 
automatic responses. It is therefore reasonable to consider the test 
a measure of the facility and accuracy of controlled association, 
involving a high degree of language ability. 

Definitions Test 

Description: As in the case of the differences and likenesses 
test, a series of tests from different age levels of the Stanford Revision 
of the Binet-Simon scale were used. The definitions tests 
from Year V and from Year VIII, the definition of abstract terms 
from Year XII, and the differences between abstract terms from the 
average adult series comprise the present test. The Terman method 
was employed except that the definition was written and a time 
element introduced. One minute was allowed for each of the defi- 
nitions in the first three tests and two minutes for each in the fourth 
test. In scoring, the same method was followed as in the case of 
differences and likenesses, and the same criticism as to the accuracy 
of the quintile ratings applies here. 

Discussion: The definitions test differs from those previously 
discussed in that it tests neither intelligence nor efficiency in mental 
processes, but is employed as an index of intellectual development 
as displayed by the number of words at the disposal of the individual. 
Since it may fairly be said that formal education consists in adding 
to the number of usable idea-symbols and increasing their distinction, 
the vocabulary test provides a simple and quite trustworthy measure- 
ment of intellectual status. With formal education so important a 
factor in each, it is not surprising to find the high degree of correlation 
noted by Terman between the results of his vocabulary test and 
intelligence quotients determined by the Stanford Revision. 

While the principle involved is the same, the test here employed 
differs from the usual vocabulary test in that only a limited number 
of definitions were called for. The purpose was rather to demon- 
strate the various stages of definition than to actually test the college 
group. Beginning with definition "by use" at the five-year level, 
the series shows the development of definition "superior to use" in 
the eighth year. Both of these types have a definite perceptual 
basis, and it is not until the twelfth year that the processes of com- 
parison and generalization make possible the definition of abstract 
terms. In the contrasting of abstract terms, definition is related to 
the recognition of essential differences, discussed in a previous test. 

For the college student even the most difficult test of this series 
can hardly be said to present a real problem, although in some 
cases the contrast is not clearly drawn. While such processes as 
discrimination and classification enter into definition, the test may 
be considered one of intellectual development as displayed in language 
ability. 

Humpstone Memory Test 

Description: The memory test devised by H. J. Humpstone 
consists of twenty sentences, each the statement of some rather 
obscure historic fact connected with the name of some individual or 
nation. These statements are in the form of the following sentence, 
"North America was discovered by Columbus in 1492". The series 
of twenty statements was read aloud to the class three times, care 
being taken to pronounce the proper names and the dates distinctly. 
A general discussion, not connected with the experiment, was then 
entered into and continued for forty minutes. At the expiration of 
that period, the first part of each sentence was read and the members 
of the class asked to record in writing the name and date connected 
with the incident. For example, the experimenter might read 
"North America was discovered by ____", the remainder of the 
sentence being supplied by each subject. It should be kept in mind 
that care was taken in devising the test to select historical incidents 
of a trivial and therefore unfamiliar character. Since each of the 
twenty sentences required the recall of a name and a date, the results 
were scored on the basis of forty points. The scores were distributed 
and quintiled in the usual manner. 

Discussion: Various types of memory tests have been devised 
and employed since Ebbinghaus published his pioneer study in this 
field. Some of these have been open to the criticism that they test 
associability rather than memory, others that the material is unsatisfactory, 
as in the case of nonsense syllables, and still others that the 
time element involved makes them impractical for use in the class- 
room. The purpose in devising the test here described was to select 
simple material which at the same time would be unfamiliar, and 
would offer sufficient points for scoring to provide the necessary 
differentiation of results. It was further desired to construct a test 
which might be completed in the two-hour laboratory period and still 
give sufficient weight to the factor of retentiveness to make the test 
really one of memory. The Humpstone Test seems to fulfil these 
requirements satisfactorily. The three readings of the material 
bring in the element of repetition and give a fair degree of initial 
memorisation. The interval and distraction provided by the forty-minute 
discussion involve sufficient retention to make the test significant, 
and the fact that no perfect scores have been made demonstrates 
that the material chosen is of sufficient difficulty for a college 
group. The method of right associates employed in the recall needs 
no comment because of its general acceptance. The natural division 
of the recalled items into names and dates has shown the latter to 
be more difficult of retention, as might be anticipated. 

It is unnecessary at this point to enter into an analytic study of 
memory. The subject has been so thoroughly treated in standard 
text-books and scientific researches as to require no exposition here. 
It will be sufficient to note that the present test adequately calls 
into play the three abilities which are chiefly concerned in memory, 
namely, modifiability, retentivity, and recall. 

Witmer Cylinder Test. 

Description: The material here employed is an adaptation of 
the Montessori cylinders, and consists of a circular board containing 
recesses for eighteen cylindrical insets. These insets are arranged in 
three series of seven blocks each, the last cylinder of one series being 
also the first of the next series. In the first series the insets are all of 
equal diameter and vary only in height, in the second the variation 
is in diameter, the height being constant, while the cylinders of the 
third series vary in both height and diameter. The board, which is 
approximately twelve inches in diameter, contains a central recess 
in which all of the blocks may be placed, the subject then being 
required to replace them as quickly as possible. 

Each member of the elementary class in psychology was tested 
individually by either Professor Witmer, Professor Twitmyer, or 
Dr. Humpstone, this being the only one of the series which was not 
given as a group test. The student was required to stand before the 
table upon which the cylinder board was placed with all of the insets 
in their proper recesses. His attention was called to the fact that 
the tops of the different blocks were flush with the top of the board. 
The insets were then removed by the experimenter and placed in 
the central receptacle, care being taken to mix the blocks well and 
at the same time to leave the larger cylinders on top. The subject 
was then instructed to return the blocks to their original positions 
as rapidly as possible, and the time required was recorded in seconds. 
Upon the completion of the first trial the cylinders were again removed 
and the time for the second replacement determined. 

The results for each of the two trials were treated separately 
and quintile ratings obtained. In accordance with the method 
standardized by Paschal (14), a final rating was given by quintiling 
the results for the shortest trial. The ratings for the first, second, 
and shortest trials all appear in the tabulation of results. 

Discussion: While the diagnostic value of the mechanical test 
has long been recognized, the cylinder test is the only one of this 
type which has been included in the present series. The test differs 
from those which have previously been described in not requiring 
any appreciable degree of language ability, and hence can not be 
considered in any sense an index of intellectual level. If intelligence 
be defined as the ability to solve what for the individual is a new 
problem, the test is primarily one of intelligence. This, however, is 
by no means the only ability involved. On the motor side may be 
observed the rate of discharge of energy, coordination, complexity 
of response, and in some cases endurance. The performance like- 
wise displays some degree of analytic and distributed attention, 
observation, understanding, and trainability when more than one 
trial is given. While these are not the only abilities involved, they 
may usually be rated with some accuracy on the basis of the cylinder 
performance. 

As Paschal has pointed out, the test has both a qualitative and 
a quantitative aspect. In the present treatment of results, however, 
only the latter has been considered, since the performance has been 
rated solely on the basis of the number of seconds required for the 
successful replacement of the insets. The qualitative aspect of the 
performance was an important factor in determining the competency 
rating which will be discussed in the following section. In general, 
the quality of the performance must be considered of more diagnostic 
significance than the bare time element, although it is evident that a 
very rapid replacement can not be made unless the performance is 
qualitatively good, nor is it likely that excessive time will be required 
if a satisfactory method is used. 

While the quintile ratings for the first and second trials have 
been included in the tabulation of results, it is probable that the 
rating for the shortest of the two trials gives the safest index of 
cylinder proficiency. In his standardization of the test Paschal 
adopted the shortest of three trials as his criterion, and the results 
here obtained are therefore not directly comparable with those upon 
which the standardization was based. Even though the shortest 
trial gives the most reliable basis for a single rating, the comparison 
of scores made on the first and second trials is important as an index 
of trainability, and these have therefore been included in the table 
of results. 

Composite Test Rating. 

In the treatment of results it will be of interest to compare the 
records made in the various tests described above with the score of 
the Thurstone test, the rating on academic standing and that on 
estimated competency. It seems advisable to obtain a composite 
rating on the basis of the results for the series of tests in order to 
facilitate this comparison. Unquestionably, the tests are not all of 
equal value, and some method of weighting should be employed. 
Here, however, an almost unsolvable problem is encountered, for 
any system of weighting the various tests which might be adopted 
would necessarily be arbitrary and based on an a priori judgment. 
Moreover, the significance of the tests varies with the individual 
case, and no one method of weighting would be really satisfactory 
for the whole group. 

With these difficulties in mind, it has been decided to obtain 
a composite rating by taking a simple average of the quintile scores 
on the thirteen tests of the series for each individual. Such an 
average has the advantage of not being colored by personal opinions 
of the value of the different tests, and is probably as significant an 
index as could be devised by any complicated system of weighting. 
This average includes only the rating for the shortest trial with the 
Witmer cylinders. 
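
The simple average adopted above may be sketched as follows. This is a minimal illustration only: the dictionary keys are hypothetical labels for the tabulated columns, and rounding to one decimal place is an assumption, not a rule stated in the text.

```python
# Hypothetical labels for the two cylinder ratings that are tabulated
# but left out of the composite (only the shortest trial is counted).
EXCLUDED = {"cylinder_first_trial", "cylinder_second_trial"}


def composite_rating(quintiles):
    """Unweighted mean of one student's quintile grades, skipping EXCLUDED.

    quintiles: mapping of test label -> quintile grade (1 to 5).
    Rounding to one decimal place is an assumption.
    """
    used = [grade for name, grade in quintiles.items() if name not in EXCLUDED]
    return round(sum(used) / len(used), 1)


# A shortened, hypothetical record (the full series has thirteen tests).
record = {
    "opposites": 3,
    "definitions": 4,
    "memory": 5,
    "cylinder_first_trial": 1,
    "cylinder_second_trial": 1,
    "cylinder_shortest_trial": 4,
}
print(composite_rating(record))  # 4.0
```

Because every test enters with equal weight, the sketch needs no judgment as to the relative value of the tests, which is the advantage claimed for the simple average.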




The Competency Rating. 

One purpose of this investigation was to determine what reliance 
may be placed on the "snap judgment" of a trained observer. Is 
it possible to rate the college student with any degree of accuracy on 
the basis of an interview covering no more than five minutes? Can 
the experienced psychologist estimate the ability of an individual 
by noting his appearance and carriage, and by obtaining his reaction 
to a few simple questions and observing his performance with a 
mechanical test? It was with a view to answering such questions 
as these that each member of the first-year class in psychology was 
personally interviewed by either Professor Witmer, Professor Twit- 
myer or Dr. Humpstone, and given a competency rating on the basis 
of five minutes' observation. Each student was required to replace 
the insets of the Witmer cylinder test twice, as described in the 
preceding section. The qualitative aspect of this performance had 
considerable weight in determining the competency rating, and it 
should be understood that while coordination, attention, understanding, 
trainability and intelligence are all reflected in the time scores 
of the two cylinder trials, the latter do not necessarily correlate with 
a rating based on the quality of the performance. As has been 
previously noted, the cylinder test is the only one considered in this 
study which was given individually. 

The rating, however, was not based solely on the performance 
with the cylinders. As the student presented himself to the exami- 
ner, he was asked to write his name upon a record card, and the 
character of his writing as well as the degree of composure dis- 
played were observed. A few leading questions were then asked 
regarding preparatory school, purpose in coming to the University, 
intended vocation, outside activities, and the like. No attempt 
was made to ask the same questions of each individual, but rather 
to carry on a short conversation which varied naturally with the 
replies given. The subject was then given the cylinder test, follow- 
ing the procedure previously outlined, and after answering one or 
two questions as to his work in psychology was dismissed. As a 
rule the whole interview consumed no more than five minutes. 

While all three of the examiners had come into some contact 
with members of the class through lecture work, no one of them knew 
the students personally or had had occasion to be familiar with the 
type of work done by any individual. The rating was therefore based 
entirely upon an observation of the student's behavior as displayed 
in his general bearing and address, his answers to the questions, and 
his performance with the cylinders. In this respect, the competency 
rating here employed differs from the rating on estimated intelligence 
which has frequently been used in connection with investigations 
of this character. Such a rating has usually been given by 
an instructor familiar with the student and with his work in the 
classroom, or by averaging the estimates made by a number of in- 
structors so qualified. The competency rating is therefore not 
directly comparable with the ratings on estimated intelligence 
referred to in a preceding section. 

In giving these ratings, the five-point scale was used in a some- 
what modified form. Each of the five points of the scale was sub- 
divided into five lesser grades, thus giving a maximum rating of 5.5, 
a minimum of 1.1, and a mediocre grade of 3.3. When each student 
had been rated on this scale, the three examiners in conference 
arranged the members of the class in rank order on the basis of es- 
timated competency. Since it is felt that individual differences in 
the standards of the three examiners somewhat reduce the significance 
of the actual rating assigned, the rank order has been 
employed in determining a quintile rating on estimated competency, 
which appears in the tabulation of results. This treatment has the 
added advantage of making the rating directly comparable with the 
quintile scores of the various mental tests. 
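
The modified scale may be made concrete by enumerating its grades, as in the sketch below. It rests on one interpretation, namely that a rating such as 3.3 names the third lesser grade within the third point of the scale and is not a decimal fraction; the function names are hypothetical.

```python
def competency_grades():
    """All grades of the subdivided scale, lowest first, as (point, lesser grade) pairs."""
    return [(point, lesser) for point in range(1, 6) for lesser in range(1, 6)]


def label(grade):
    """Write a grade in the form used in the text, e.g. (3, 3) -> '3.3'."""
    point, lesser = grade
    return f"{point}.{lesser}"


grades = competency_grades()
print(len(grades))                      # 25
print(label(grades[0]))                 # 1.1
print(label(grades[len(grades) // 2]))  # 3.3
print(label(grades[-1]))                # 5.5
```

On this reading the scale has twenty-five steps, with 3.3 falling exactly at the middle, which agrees with its description as the mediocre grade.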

It will be well to note at this point that there is no objective 
standard by which to measure the accuracy of the competency ratings. 
In estimating the ability of the student, the attempt was not made 
to predict the degree of his success in the study of psychology, nor 
is the rating a prognosis of his relative academic standing as determined 
by the grades received in all college courses. Neither can the 
accuracy of the judgment be measured by his performance in any one 
or in any group of psychological tests. The term "competency 
rating", implying the algebraic sum of the individual's specific 
abilities and disabilities as demonstrated by his success as a member 
of human society, best interprets the character of the rating under 
discussion. In this connection it may be stated that no ratings 
lower than 2.3 were given, or, in other words, no students were found 
so deficient in general competency as to fall below the "doubtful" 
group. In view of the fact that the members of the class had undergone 
a strenuous process of selection in fulfilling the entrance requirements 
and surviving at least two years of college work, it is not 
surprising to find a complete absence of "1" ratings. This fact does 
not appear in the tabulation of results where the ratings have been 
quintiled on the basis of rank order, and only the quintile grade shown. 

Although, as has been pointed out, the competency rating can 
not be checked by comparison with mental test scores or academic 
record, it will nevertheless be profitable to determine in the later 
treatment of results whether the rating shows any significant degree 
of correlation with competency as displayed in the tests and college 
grades. 

Academic Rating. 

Popular tradition has it that the youth whose scholastic attain- 
ments make him valedictorian of his college class is destined for 
future mediocrity, while the typical campus lounger whose academic 
life is cut short by a heartless faculty is sure to make his mark in the 
world of success. Nevertheless it will hardly be denied that pro- 
ficiency in the classroom is to some degree indicative of individual 
competency, and it will therefore be desirable to know something 
of the relative academic standing of the fifty students under 
consideration. 

While it might be contended that preparatory school records 
would be significant in determining a rating on scholastic merit, the 
great variation in standards and the incomparability of the various 
grading systems employed make it advisable to reject this suggestion 
without further deliberation. Moreover, since grades for at least 
two years of college work are available for each member of the group, 
it seems unnecessary to base the academic rating on any work other 
than that done at the University of Pennsylvania. 

As has previously been stated, a five-point system of grading is 
employed in the School of Arts and Science. This scheme provides 
three symbols for work of passing grade, while two are reserved for 
that of an unsatisfactory character. To be more specific, the letters 
"D", "G", "P", "N", and "F" are assigned, signifying Distinguished, 
Good, Passed, Not Passed (conditioned), and Failure, 
respectively. A student receiving a grade of "N" in a course may 
relieve himself of the condition by passing a re-examination, while 
the grade "F" necessitates the repetition of the course. As applied 
to the courses in psychology, an "N" in Psychology 1 permits the 
student to continue his work in Psychology 2, but this permission is 
not given when "F" is received in the first course. It will be noted, 
therefore, that no member of the present group received a grade of 
"F" in Psychology 1, since each of the fifty students completed both 
courses in the academic year 1919-20. 

While it must be understood that the letter system of grading 
is intended to obviate the pseudo-accuracy of the percentile grade, 
and that it is not possible to assign percentile equivalents for the 
symbols used, the necessity for obtaining some kind of composite 
rating as an index of academic standing is evident. For example, a 
given student may have received a grade of "D" in five courses, 
"G" in eight, "P" in four, and "N" in two. Moreover, each course 
may have required from one to nine hours of class attendance per 
week, with a value of from one to four units of credit, a unit being 
the equivalent of one hour of lecture work or two hours of laboratory 
work per week for the academic year. Hence it is clear that the 
grades must be considered in terms of units of credit rather than by 
courses if a significant rating is to be obtained, and also that some 
numerical translation of the letter grades must be devised. 

Since the percentile scale is not recognized in the University 
marking system, any numerical equivalents which might be adopted 
would necessarily be arbitrary. Roughly, it may be said that "D" 
represents a range from 90 to 100 per cent, "G" from 80 to 90, and 
"P" from 70 to 80. There seems to be no justification, however, 
for selecting 70 per cent as the passing mark, nor would it be more 
accurate to place it at 60 per cent. A satisfactory evaluation of the 
"N" and "F" is even more confusing. While the passing grades 
might be valued at 95, 85, and 70, respectively, it would be difficult 
to decide whether the "F", which ranges from zero to 50 per cent, 
should be rated as 25 or 45. By far the simplest solution to the 
problem, and what seems to be the most logical, is to adopt here 
the five-point scale generally employed in this study. It is quite 
as reasonable to represent the five letter grades by the numbers 5, 
4, 3, 2, and 1, as by any other numerical values which might be 
suggested, and this method has the advantage of permitting a direct 
comparison between the composite rating for college grades and that 
for mental tests. It has been determined, moreover, that the rank 
order remains approximately the same whether this system is used 
or the values 95, 85, 70, 55, and 45 be given to the letter grades. 

The academic rating has therefore been determined by multiply- 
ing the number of units assigned each letter grade by the appro- 
priate digit, and dividing the sum of these products by the total 
number of units graded. The student who had received no grades 
lower than "D" would have a rating of 5.0, while a record with an 
equal number of "G" and "D" units would average 4.5. Since it 
would be almost impossible for a student to remain in college who 
had not averaged the passing grade, it is not surprising to find that 
only one of the fifty has an average below 3.0, his rating being 2.9. 
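
The computation of the academic rating is a unit-weighted mean, and may be sketched as below. The function and variable names are hypothetical; the digits 5 to 1 and the alternative values 95, 85, 70, 55, and 45 are those given in the text, while the sample records are invented for illustration.

```python
# Digits assigned to the letter grades, as described in the text.
GRADE_POINTS = {"D": 5, "G": 4, "P": 3, "N": 2, "F": 1}

# Alternative values mentioned in the text for comparison of rank order.
ALT_POINTS = {"D": 95, "G": 85, "P": 70, "N": 55, "F": 45}


def academic_rating(units_by_grade, points=GRADE_POINTS):
    """Unit-weighted mean grade.

    units_by_grade: mapping of letter grade -> units of credit earned
    with that grade.
    """
    total_units = sum(units_by_grade.values())
    if total_units == 0:
        raise ValueError("no graded units")
    # Multiply the units under each letter by its digit, then divide
    # the sum of the products by the total number of units graded.
    weighted = sum(points[g] * u for g, u in units_by_grade.items())
    return weighted / total_units


print(academic_rating({"D": 10}))          # 5.0
print(academic_rating({"D": 6, "G": 6}))   # 4.5
```

Passing `ALT_POINTS` as the second argument applies the percentage-like values instead, which permits the rank orders under the two systems to be compared.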

Perhaps even a rating of this kind implies an accuracy of measurement 
which cannot be justified. If every "D" assigned as a 
final grade stands for the same level of excellence, and if the same 
amount of work is required for a passing grade in every course, then 
the validity of the average rating cannot be questioned. If, however, 
one department of the college is found to be giving the highest grade 
to 25 per cent of its students, while a second allows only 5 per cent 
of "D"s, then the impossibility of comparing grades assigned by 
different departments is evident. Moreover, it has been demon- 
strated that different instructors in the same department vary greatly 
in the grades which they assign to a given piece of work, and that 
this variation is no greater than that which will be shown by one 
instructor marking the same work at different times. It is indeed 
questionable whether any reliance should be placed in a comparison 
of college grades in an institution where the majority of the courses 
are elective, and where there is no general supervision of grading. 

The grading problem is by no means a new one, and has a con- 
siderable literature of its own. Finkelstein (15), for example, has 
published an interesting study of conditions at Cornell University in 
which he demonstrates the need for supervision of the grades assigned 
by different departments by showing that some instructors are typi- 
cally low markers. He makes a plea for the adoption of a five-division 
system of grading with the provision that the grades given by any 
instructor shall not deviate in the long run from a distribution agreed 
upon. While the intention of the present study is not to preach the 
necessity of some such general supervision of grading at the University 
of Pennsylvania, the existing absence of uniformity demands com- 
ment. Under the present curriculum, a student in the School of Arts 
and Science is required to complete a specified number of units of work 
in each "group" of subjects. In most cases he is free to elect which 
courses he will pursue in a given group. For example, six units of 
credit is required in the Biological Science Group which is composed 
of courses in botany, zoology and psychology, but the decision as to 
whether all six units be taken in one subject or be distributed between 
two is left entirely to the student, as well as the choice of the subject 
or subjects to be elected. Until recently the elementary work in one 
of the three subjects has been so much less difficult than that in the 
other two, that the situation has been fully recognized by the under- 
graduate, with a consequent influx to the easier course. While this 
condition has been remedied in the case cited, it doubtless still exists 
in other groups, and the present plea is made rather with the purpose 
of calling attention to the lack of general supervision of grade dis- 
tributions than as a criticism of any particular instance of non-con- 
formity. Although the necessity of some general supervision of all 
grades assigned in the college cannot be overlooked, the more pressing 
need of uniformity within the various groups must be emphasized. 

From the foregoing discussion it is evident that grades assigned 
by various instructors in different departments of the University are 
not really comparable, and it is with this understanding that the 
academic record will be included in the present investigation. Even 
though the data cannot be considered scientifically accurate, however, 
it must be admitted that the student's college grades do give some 
indication of his scholastic ability. The grades alone determine 
whether he is to receive academic honors or be dropped by the 
Executive Committee for general deficiency, as well as playing an 
important part in election to Phi Beta Kappa and in placement after 
graduation. 

In the tabulation of results, the final grades for the two courses 
in psychology will be noted in addition to the average rating for all 
college grades including those in psychology. The latter are given 
separately since it is felt that the unusual opportunity for personal 
contact between instructor and student in the elementary courses in 
this department makes these grades somewhat more significant than 
is generally the case. 

In conclusion, it seems almost unnecessary to point to the fact 
that similar grades may not mean the same thing when assigned to 
different students even in the same course. Although the attempt 
is made to control the amount of work done by fixing the maximum 
as well as the minimum number of units which may be taken by a 
student in a semester, some carry so full a roster as to seriously 
interfere with the display of actual ability, while others who are not 
experiencing great success with a comparatively light schedule may 
be handicapped by outside work which they are pursuing as a means 
of livelihood. Since the evaluation of these distributing factors is 
well nigh impossible, they must be ignored in the present treatment 
of college grades. 

Tabulation of Results. 

While it was intended to make a statistical study of the various 
scores and ratings which form a basis for this investigation, the 
primary purpose was to study the individual record rather than the 
mass results. It has therefore been deemed advisable to present a 
complete tabulation of the ratings for each member of the group, 
and thereby facilitate the scrutiny of the individual case. In the 
following table will be found (1) the number used to designate each 
student in the group, (2) his class, whether sophomore, junior, or 
senior, (3) the quintile rating for the Thurstone test, (4) the quintile 
rating for each of the thirteen mental tests with the addition of the 
ratings for the first and second trials with the cylinders, (5) the com- 
posite test rating obtained by averaging the ratings for the thirteen 
separate tests (this average does not include the Thurstone test, and 
only the shortest trial with the cylinders is included), (6) the quintile 
rating based on estimated competency, (7) the final grades in 
Psychology 1 and Psychology 2, (8) the academic rating obtained by 
averaging college grades as previously described. 

[Table: Tabulation of Results. Ratings for each of the fifty students 
under the eight headings listed above. The tabulated values are not 
recoverable from this copy.] 

In studying the tabulation of results it must be borne in mind 
that in every case the quintile rating was obtained from the dis- 
tribution of the results of the class of approximately two hundred 
students, and not merely on the basis of the fifty here included. 
This explains the fact that the ratings are not equally divided among 
the five quintiles. 
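
The effect described above can be sketched in a few lines of Python. The text does not state how the quintile boundaries were computed, so the rank-based rule below, the function name `quintile_rating`, and all of the scores are assumptions made for illustration only. The sketch shows that a subgroup of fifty rated against a class of two hundred need not fall evenly into the five quintiles.

```python
from bisect import bisect_left
from collections import Counter


def quintile_rating(score, class_scores):
    """Return 1 (lowest fifth) through 5 (highest fifth) of the class distribution."""
    ordered = sorted(class_scores)
    if not ordered:
        raise ValueError("class_scores must not be empty")
    # Number of class scores strictly below this score.
    below = bisect_left(ordered, score)
    # Map the rank to a fifth of the distribution; cap at 5 for scores above all others.
    return min(5, below * 5 // len(ordered) + 1)


# Hypothetical class of 200 scores, and a subgroup of 50 drawn from its upper half.
class_scores = list(range(200))
subgroup = list(range(100, 200, 2))
counts = Counter(quintile_rating(s, class_scores) for s in subgroup)
print(sorted(counts.items()))  # prints [(3, 10), (4, 20), (5, 20)]
```

In this invented case no member of the subgroup falls in the first or second quintile, although every fifth of the full class holds exactly forty scores.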

Discussion of Results. 

In considering the data tabulated on the preceding page, it will 
first be of interest to determine whether any significant correlations 
exist between the various ratings given for the group as a whole, and 
then to study the results for the individual student. It will be 
valuable to ascertain, for example, whether the rating for the Thur- 
stone test correlates with the average score for the series of more 
specialized mental tests. Since general intelligence may be looked 
upon as an average of the specific abilities of the individual, a high 
correlation might well be expected between these two ratings. Each 
of these, in turn, must be compared with the rating on estimated 
competency, and it will likewise be profitable to observe whether 
any one of these three ratings may be considered an index of pro- 
ficiency in college work. 

With this purpose in view a series of intercorrelations has been 
calculated between the ratings assigned for the four general divisions 
of the results. In each case the coefficient of correlation was ob- 
tained by the Pearson method. The data employed consists of the 
quintile grade on the Thurstone test, the average rating for the 
thirteen mental tests, the quintile rating on estimated competency, 
and the average rating for college grades. 

Correlations. 

Competency rating with mental tests        r = +0.49 
Thurstone test with mental tests           r = +0.40 
Thurstone test with college grades         r = +0.39 
Thurstone test with competency rating      r = +0.36 
College grades with mental tests           r = +0.21 
College grades with competency rating      r = +0.10 
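
The Pearson method named above can be sketched as follows. The function name `pearson_r` and the six pairs of ratings are hypothetical and are not taken from the tabulation; the sketch only shows the arithmetic by which a coefficient of this kind is obtained from two series of ratings.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length series of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings for six students (illustrative only).
thurstone = [3, 5, 2, 5, 3, 5]
composite = [2.9, 3.9, 2.6, 3.3, 3.0, 3.3]
r = pearson_r(thurstone, composite)
print(f"r = {r:+.2f}")
```

The coefficient always lies between -1 and +1, and it is undefined when either series shows no variation, which is why the sketch raises an error in that case.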

A mere inspection of the coefficients listed above will show that 
while all of the correlations are positive, not one can be considered 
significant. In general, it may be stated that coefficients between 
+0.30 and +0.75 show that the same factors are operative in the 
two series to some degree, but the correlation can hardly be regarded 
as significant unless a coefficient greater than +0.75 is found. An 
immediate conclusion can therefore be drawn either to the effect 
that the values employed are not to be relied upon, or that the per- 
formances rated in the four cases did not involve the same factors 
or abilities. Nevertheless, it will be of interest to scrutinize the 
coefficients obtained more closely, and to attempt to interpret 
them. 
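
A different yardstick, long used alongside the Pearson coefficient but not the one applied in the text, is the probable error of r, given by the classical formula 0.6745(1 - r^2)/sqrt(N). The sketch below applies it to the six coefficients listed above. The value N = 50 is an assumption taken from the size of the group, and the function name `probable_error_of_r` is introduced here for illustration.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Classical probable error of a Pearson coefficient: 0.6745 * (1 - r^2) / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 0.6745 * (1 - r * r) / sqrt(n)


# Coefficients reported in the text; n = 50 is assumed from the size of the group.
reported = {
    "Competency rating with mental tests": 0.49,
    "Thurstone test with mental tests": 0.40,
    "Thurstone test with college grades": 0.39,
    "Thurstone test with competency rating": 0.36,
    "College grades with mental tests": 0.21,
    "College grades with competency rating": 0.10,
}
for label, r in reported.items():
    pe = probable_error_of_r(r, 50)
    print(f"{label}: r = {r:+.2f}, P.E. = {pe:.3f}, r / P.E. = {r / pe:.1f}")
```

The ratio of a coefficient to its probable error addresses whether the coefficient could have arisen by chance, which is a separate question from the size threshold of +0.75 adopted in the discussion.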

The highest correlation of the series is found to exist between 
the rating for estimated competency and that for mental tests. This 
is not surprising since the competency rating was given largely on 
the basis of the performance displayed in the solution of one of these 
tests. In view of the fact that the cylinder test calls into play so 
many of the abilities which enter into other tests of the series, it is 
rather surprising that the correlation did not prove greater. This 
can probably be accounted for by the fact that the cylinder test does 
not involve language ability, which is an important factor in prac- 
tically all of the other tests. 

Next in order is found the correlation between the Thurstone 
test and the mental test rating. As has been pointed out, both of 
these ratings may in a sense be considered indices of general intelli- 
gence, and since many tests in the series involve intellectual pro- 
cesses similar to those called for in the Thurstone examination, the 
low correlation displayed here is again unexpected. However, the 
weight given to the time element in the latter test is so great, and 
the range of abilities involved so much more restricted than in the 
Pennsylvania series, that it is not difficult to account for the seeming 
inconsistency of the results. 

The very low correlations obtained between the academic rating 
on the one hand and the mental test and competency ratings on the 
other, provide food for serious reflection. The question which must 
naturally arise is whether academic proficiency, as it is evaluated in 
our colleges today, is really an index of the competency of the student. 
Perhaps it will be well to notice whether the low correlation shown 
here is typical of other similar investigations. In the report by 
Caldwell (7) previously referred to, appears a summary of the results 
obtained by other experimenters showing correlations obtained be- 
tween various series of mental tests and college grades. In this con- 
nection, it is unnecessary to note in detail the character of the tests 
used by each of the investigators, and merely a statement of the 
correlations obtained, as cited by Caldwell, is shown below. 



Correlation of Test Results with College Grades. 

Wissler                  0.09 
Calfee                   0.23 
Rowland and Lowden       0.37 
Waugh                    0.41 
Kitson                   0.44 
King and McCrory         0.39 
Caldwell                 0.44 

While the correlations above are in most cases greater than that 
obtained in the present investigation, namely 0.21, it will be noted 
that not in a single instance was a significant coefficient shown. 
Rogers (8) does not even attempt to calculate a coefficient of corre- 
lation between test results and college grades, but states that "to 
predict an individual's probable status in academic work from his 
performance in the tests would obviously be rash." As has previously 
been stated, a comparison of the competency rating with ratings on 
estimated intelligence cited in other investigations is hardly possible, 
since in this case the estimate was made by an individual unfamiliar 
with the students rated. It is well to note, however, that even where 
intelligence was graded by instructors well acquainted with their 
students, correlations with college grades have not exceeded 0.60. 

From the facts given above it is possible to arrive at three con- 
clusions. In the first place, college grades may not actually reflect 
the mentality of the student, or secondly, the tests employed are 
inadequate or misleading, or finally, the factors which enter into the 
assignment of college grades are not the same as those which are 
measured in psychological tests. Probably all three of these con- 
clusions are in some degree justified. 

Voice has been given recently to much criticism of the present 
university curricula on the grounds of impracticality and because of 
the continuation of secondary school pedagogical methods in insti- 
tutions of higher learning. On the other hand, a large proportion of 
the instruction in our colleges today is given by means of lectures. 
The grade assigned at the end of the course is often determined 
chiefly by the student's ability to give back on an examination paper 
certain information which has been fed to him in lectures during the 
term. Frequently, little intelligence is called for and the student is 
rated either on the excellence of his memory or on the degree of 
industry with which he compensates for a deficiency in that ability. 
When to this criticism of university instruction is added the unreli- 
ability of the grades themselves, as discussed more fully in an earlier 
section, it is evident that the low correlation between college grades 
and test results may be in part due to shortcomings of the educational 
system both as regards methods of instruction and grading. 

In scrutinizing psychological tests as a whole or the series em- 
ployed here in particular, certain criticisms must be made. Perhaps 
the exaggeration of the importance of the time element is the most 
serious fault with the majority of mental tests. Intellectual dex- 
terity is generally measured rather than organization and usability 
of knowledge. The difficulty is increased in this case by the homo- 
geneity of the group tested. Many of the tests would be significant 
when applied to individuals less carefully selected and at a lower 
level of mental development. In most cases the problem presented 
is too easy to tax the college student, and the speed of reaction is 
the only ability measured. Another criticism which may be made of 
tests in general, is that they do not measure with sufficient accuracy 
the abilities which they are designed to gauge. In other words, a 
subject does not always give the same score on the same or equivalent 
tests due to variations in attention, interest, physical condition, etc. 
Mental testing will not be scientifically accurate until the technique 
has been so refined as to greatly reduce the probable error of the 
score, or until a higher reliability coefficient can be obtained. The 
low correlation between college grades and mental tests may, then, 
be due to shortcomings of the latter as well as to inaccuracies of the 
former. 

It seems reasonable, however, to believe that this lack of corre- 
spondence can be attributed largely to the fact that college work 
involves other factors than those measured by any series of psycho- 
logical tests which has yet been devised. In addition to the mental 
abilities which go to make up the competency of the individual, the 
factor of motivation plays a most important rôle in academic success. 
It is possible to conceive of two students of approximately equal 
competency, one of whom is inspired by the desire to excel in intellec- 
tual pursuits, while the other is in college for the purpose of enjoying 
social or athletic advantages. The intense interest and industry of 
the first is likely to result in a higher academic rating than would 
be predicted from his performance in series of mental tests, while 
just the opposite is true in the case of the second student. While 
it is fair to believe, therefore, that psychological tests can be em- 
ployed to select those students who have the ability to succeed in 
college, they will not form an adequate basis upon which to predict 
academic success until some means has been devised of measuring 
motive in quantitative terms. The final solution of the problem 
will be reached when more accurate methods of assigning college 
grades have been adopted, and those grades depend more on the 
higher thought processes and less on memory, and when, on the 
other hand, psychological tests have been made more difficult, place 
less stress on the time element, and include some index of motivation. 

Although it must be admitted that the formulation of a series 
of mental tests which will accurately predict success in college work 
is desirable, no great benefit would accrue thereby either to the 
science of psychology or to the field of education. The psychologist 
is not so much interested in the abilities which determine college 
grades, as in evaluating the particular mental assets and liabilities 
which characterize the individual. While the general intelligence 
rating, which represents the summation of the scores in a number 
of tests, is doubtless of some significance, the analysis of such a 
rating so as to show the peculiar abilities and disabilities of the 
individual is of much greater importance from the point of view of 
psychology. An inspection of the results shown in the preceding 
tabulation reveals the fact that although two students may have 
the same average test rating, the scores obtained in the different 
tests are not really reflected in this average. Of two individuals 
who had an average rating of 3.3 in the thirteen tests and who 
received the same quintile rating on the Thurstone test, one was 
placed in the third quintile in nine of the tests, the other in only 
three. Obviously the first student showed consistent mediocre 
ability, while the second displayed considerable variation in the 
different tests, having four ratings of "5", three of "2", and one "1". 
There is no doubt that the latter student provides the more interesting 
material for psychological study and for vocational guidance. 
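
The point that an average conceals the spread of the separate ratings can be shown with two invented profiles. Neither list is taken from the tabulation: the first is merely patterned on a student with nine ratings in the third quintile, and the second on one with four ratings of "5", three of "2", one "1", and three in the third quintile, the remaining ratings being filled in as assumptions.

```python
from statistics import mean, pstdev

# Hypothetical profiles of thirteen quintile ratings each (illustrative only).
steady = [3] * 9 + [4] * 4
varied = [5] * 4 + [2] * 3 + [1] + [3] * 3 + [4] * 2

for name, profile in (("steady", steady), ("varied", varied)):
    print(
        f"{name}: mean = {mean(profile):.2f}, "
        f"spread = {pstdev(profile):.2f}, "
        f"ratings in third quintile = {profile.count(3)}"
    )
```

The two means differ by less than a tenth of a quintile, while the standard deviation of the second profile is roughly three times that of the first, so some measure of spread reported beside the average would separate the two cases.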

Since it is believed that the present investigation is significant 
rather in the analysis of individual competency than in the correla- 
tion of group results, it will be the purpose in the following section 
to scrutinize the record of each member of the group and to deter- 
mine whether any conclusions of value in diagnosis or guidance can 
be reached. In considering the academic rating it is well to note 
that ratings higher than 4.0 are very good, while those below 3.5 are 
poor. The median academic rating for the group is 3.7. Composite 
test ratings above 2.9 and below 3.5 are considered mediocre, with the 
median rating at 3.2. 
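
The bands just stated can be written out as a small sketch. The text names only "very good" and "poor" for the academic rating and "mediocre" for the composite test rating; the labels "intermediate", "below mediocre" and "above mediocre", like the function names, are assumptions added here to cover the remaining cases.

```python
def academic_band(rating):
    """Band for the academic rating: above 4.0 very good, below 3.5 poor."""
    if rating > 4.0:
        return "very good"
    if rating < 3.5:
        return "poor"
    return "intermediate"  # label assumed; the text leaves 3.5 to 4.0 unnamed


def composite_band(rating):
    """Band for the composite test rating: above 2.9 and below 3.5 is mediocre."""
    if 2.9 < rating < 3.5:
        return "mediocre"
    # Labels outside the mediocre band are assumed, not given in the text.
    return "above mediocre" if rating >= 3.5 else "below mediocre"


# The two medians stated in the text.
print(academic_band(3.7), "/", composite_band(3.2))  # prints intermediate / mediocre
```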

Analysis of Individual Records. 

No. 1. 

This student shows a consistently mediocre record until his 
college grades are observed, when he is found to have one of the 
highest academic ratings of the group. Placed in the middle quintile 
in the Thurstone test as well as in estimated competency, his average 
test rating is well below the median. As for the separate tests, he 
has received the highest rating in none, and the lowest only in the 
memory span for digits. In general, the higher scores are exhibited 
in those tests which involve language ability and memory, and the 
lower where these factors are not prominent, namely, in the Taylor 
number test, the Courtis test, and the cylinders. In view of the 
high grades in psychology and the high academic rating it seems 
probable that this student has some strong motive, such as ambition, 
and supplements a mediocre intellect with an unusual amount of 
industry. 

No. 2. 

This record shows the most consistently high rating to be found 
in the group. The academic record is the highest, and this is borne 
out by "distinguished" grades in both courses in psychology. The 
ratings for estimated competency and for the Thurstone test are 
both in the fifth quintile, while the average test rating is equaled 
by only one other student in the group. In considering the results 
of the particular tests, it will be observed that this student has not 
fallen below the middle quintile, but has reached the highest in only 
three tests. He shows the poorer scores in those tests which stress 
language ability and memory, and the higher ratings where intel- 
ligence, imagination and attention are involved. The general level 
of performance is so high as to make any specific recommendation 
or prognosis unsafe. 

No. 3. 

The chief point of interest in this case is the lack of correspond- 
ence between the competency rating and the remainder of the data 
at hand. This student shows an academic rating which places him 
in the poorest fifth of the group, with conditions for both courses in 
psychology. Although in the second quintile in the Thurstone test, 
his average test rating is one of the lowest recorded. He rates above 
the middle quintile only in the sentence completion and cylinder 
tests. This may indicate good intelligence not directed toward 
college work, but the conclusion that the competency rating is too 
high in this case seems justified. 

No. 4. 
The indication here is of a student of somewhat more than 
average general intelligence whose record is largely influenced by 
interest in the task at hand. With an academic rating slightly 
above the average and a "G" and "P" in psychology, his score in the 
Thurstone test puts him in the highest quintile. The composite test 
rating is slightly above the average, and shows a preponderance of 



44 

''5's" as well as a number of "2's" and a " 1 ". High ratings in the 
memory span for ideas, the sentence completion and the definitions 
tests, as contrasted with a very poor cylinder perfonnanoe, indicate 
intellectual ability rather than intelligence. 

No. 5. 
This record shows a student somewhat below the average in 
competency, with an academic rating slightly better than would be 
expected from the test results. Passing grades in both courses in 
psychology, estimated competency in the second quintile, and the 
median rating for the Thurstone test all indicate mediocre ability. 
This is borne out by an average test rating below the mean for the 
group. Performances in the memory spans for digits and ideas, and 
in the definitions and memory tests were rated in the lowest quintile. 
High scores were obtained in the Ausfrage and Courtis tests and in 
the second trial with the cylinders. Although the test results show 
great variation, there seems to be no definite tendency displayed. 

No. 6. 
This individual probably possesses mediocre ability, although 
receiving a very low competency rating and a very high score on 
the Thurstone test. A fair academic rating with passing work in 
psychology, and a test rating slightly below the average seem to 
indicate that neither the Thurstone test nor the competency rating 
gives a true picture of the student. High scores in the Ausfrage, 
digit span, Trabue, opposites and memory tests, with very poor 
cylinder performances, would suggest fair intellect coupled with 
rather deficient intelligence. 

No. 7. 
The record here indicates relatively low competency with a 
high degree of native intelligence. A very poor academic rating is 
substantiated by a condition and a failure in the two courses in 
psychology, and a low average test rating. An exceptionally good 
performance with the cylinders and a high rating on the Thurstone 
test and the idea span, with lower scores on the test requiring language 
ability and memory, lead to the conclusion that this man is misplaced 
in college, but would probably succeed in a pursuit which does not 
stress intellectual development. 

No. 8. 

There are no outstanding features in the record of this student. 
The academic rating and the Thurstone score are both slightly 
above the average, while the test rating is somewhat below. The 
tests which emphasize the intellectual side usually show good scores, 
while those which do not depend on language ability, such as the 
Taylor number, the Courtis, and the cylinder tests, are placed in the 
lower quintiles. On the whole, the record is mediocre. 

No. 9. 
In this instance, a high academic rating, good work in psychology, 
a high competency rating and a good score on the Thurstone test 
fail to correlate with a rather low test rating. Median scores on seven 
of the tests, with only one result in the highest and one in the lowest 
quintile, indicate a rather consistent mediocrity. A high rating 
in the memory test and an excellent cylinder performance suggest 
that good memory and intelligence are responsible for the high 
academic standing. 

No. 10. 
A competency rating of "4" indicates that this man was not 
doing his best on the mental tests. Mediocre college work and a 
low rating on the Thurstone test suggest that the competency rating 
is too high. The test scores are generally low where language ability 
is involved, and are above the average for the Taylor, idea span, 
memory and cylinder tests. As in Case 7, it seems likely that this 
individual is not profiting by his college course and would be more 
successful in some other line of activity. 

No. 11. 

The record of this student is quite inconsistent. Placed in 
the lowest quintile in the Thurstone test and competency rating, 
his test and academic ratings are well above the average. The low 
score in the first cylinder trial indicates a lack of intelligence, while 
the marked improvement on the second trial indicates good train- 
ability. The low rating on the Trabue test and idea span, contrasted 
with high ratings for the Courtis and Humpstone memory tests, 
suggest an efficient and retentive mind rather than a quick and 
imaginative one. That this man is a slow thinker is demonstrated by 
his score on the Thurstone test. The fact that he retains and digests 
the information which he acquires is evidenced by his high academic 
record. 

No. 12. 

This student displays a record consistently near the average for 
the group. The Thurstone and academic ratings are somewhat better 
than the mean, the competency rating is in the third quintile, and the 
test rating slightly below the average. Of the separate tests, seven 
are rated in the third quintile, a poor score on digit span and a very 
high rating on the Trabue test being the only significant scores. 
On the whole, the competency rating seems to express the ability of 
the student adequately. 

No. 13. 

The record in this case is consistently high. Very good grades 
in the two courses in psychology substantiate an academic rating 
which is exceeded by only three members of the group. A competency 
rating of "5" and a Thurstone rating of "4" correlate with a high 
test rating. The only rating in the lowest quintile is that on the 
memory test and when this is contrasted with an exceptionally good 
performance with the cylinders, it seems reasonable to conclude 
that this student depends more on intelligence than on memory in 
his college work. Almost without exception ratings in the upper 
quintiles are displayed for the tests which do not stress language 
ability, while lower ratings are found where this factor is of great 
importance. 

No. 14. 

This record presents an interesting contrast with that of student 
No. 13 in that the intellectual rather than the intelligence factors 
are here stressed. While not quite so good from the academic view- 
point, this record shows a slightly higher rating for the Thurstone 
and other mental tests than does the preceding case. Ratings of 
"5" on the Ausfrage, digit span, Trabue, and memory tests indicate 
associability, language ability and retentiveness, while a rating in 
the lowest quintile for the first cylinder trial implies comparatively 
poor intelligence. A much better record on the second trial with the 
cylinders shows trainability, which, coupled with a high memory 
span and good memory, pictures a student of more than average 
intellect. 

No. 15. 

The indication here is of a man of high general intelligence who 
does not care to apply himself to college work. On the one hand 
his academic rating is mediocre and he has obtained merely passing 
grades in psychology, while contrasted with this are Thurstone and 
competency ratings in the highest quintile, and a combined test 
rating well above the average. The low rating on the Courtis test 
is probably the only score of particular significance, and seems to 
indicate laziness and lack of interest. In view of the higher scores 
on the other tests this explanation may also hold for the low rating 
on definitions. On the whole the picture is that of a student with 
real ability who does not care to exert himself. 



No. 16. 

In spite of a good rating on the Thurstone test, this record 
indicates an individual of somewhat less than average ability. 
Although the academic rating is fair, the competency rating and the 
composite test rating are both low. Ratings below the middle quin- 
tile are found for the Taylor number test, the digit and syllable 
spans, the Trabue and definitions tests, while only the ratings for the 
Ausfrage, Courtis and memory tests are better than the average. 
It seems likely that this student supplements good retentiveness 
with more than the usual degree of industry in passing his college work. 

No. 17. 

Thurstone and competency ratings in the lowest quintile com- 
bined with the lowest composite test rating of the group indicate 
decidedly inferior ability in this case. Eight of the separate test 
ratings are below the middle quintile and only three are above. 
Low ratings on the Thurstone, Taylor, Trabue, Courtis, opposites 
and cylinder tests, all of which involve a definite speed factor, sug- 
gest that a slow rate of discharge is primarily responsible for the 
poor test performances of this individual. High ratings in the 
digit span, description, and definitions tests, in all of which the time 
element is relatively unimportant, seem to bear out this conclusion. 
An observation of the scores of the three memory span tests shows 
that as the material becomes more complicated the rating is lower. 
This man evidently needs time to think, and does well when the time 
is not limited. This fact explains the lack of correlation between the 
test ratings and the academic record, which is at least average, 
and it also emphasizes the undue weight given to the time factor in 
most mental tests. 

No. 18. 

This record displays the interesting combination of a very 
high academic rating with mediocre performance in the various men- 
tal tests. The record is quite comparable with that of student 
No. 1 with the exception that in this case nine of the thirteen test 
results are found in the middle quintile. High ratings in the Thur- 
stone and Courtis tests suggest alertness, and this ability, in con- 
junction with a good rating on memory, may be partly responsible 
for the success in college work. It seems probable, however, that 
some motivation factor which cannot be measured by the test 
results has played an important part in the academic attainments 
of this student. 



No. 19. 
In this case the record, with the exception of the grades in 
psychology, is consistently above the average. Low ratings on the 
Courtis and cylinder tests might suggest a slow rate of discharge 
were it not for a very high rating on the Thurstone test. High 
scores on the three memory span tests, the Trabue, definitions, and 
memory tests show associability, retentiveness, and language ability, 
which may be looked upon as essential factors in intellectual develop- 
ment. The low rating on the cylinders hardly seems significant in 
view of the other test results, although it may indicate a deficiency 
in mechanical as contrasted with mental ability. 

No. 20. 
This record provides an interesting comparison with that of 
student No. 19. Although the psychology grades, competency 
rating, and Thurstone rating are identical, this student has a some- 
what lower academic rating and a correspondingly lower composite 
test rating. Even the ratings for the separate tests show similar 
tendencies, but the scores for the Courtis and cylinder tests are lower 
here than in the preceding case. The most significant difference 
between the two records is found in the very low memory rating of 
this student, which places him definitely in the mediocre group. 

No. 21. 
This record is one of the most consistent to be found in the group
and places the student definitely in the fourth quintile. The academic 
rating is quite high, the Thurstone and competency ratings are both 
"4", and the composite test rating is well above the average. The
separate test scores indicate little, since all but two of the ratings are 
in the middle and upper quintile. Although the first cylinder trial 
was slow, the second trial compensated for this deficiency. There 
is no comment to make on this case other than a desire that mental 
tests might always correlate so closely with academic standing. 


No. 22.
While this record is, on the whole, mediocre, the academic 
rating is somewhat higher than might be expected in view of the 
low Thurstone and competency ratings. The latter may possibly be 
accounted for by the poor intelligence displayed in both cylinder 
performances, while good ratings on the tests requiring language 
ability, and particularly on the memory test, provide a satisfactory
explanation for the fair academic rating. From the test results
it seems probable that this individual has to apply himself to his
studies in order to do passing work. 




No. 23. 
The failure in Psychology 2 is the only discordant note in an 
otherwise mediocre record. The composite test rating and that for 
the Thurstone test are about average for the group, while the com- 
petency rating is in the fourth quintile. The separate test results 
do not seem significant except for a high rating in memory. The 
poor work in psychology must probably be accounted for by lack of
interest or failure to study. 

No. 24. 
The record here is comparable with that of student No. 15 in 
that a high composite test rating is contrasted with a low academic 
rating. In this case, however, the discrepancy is even more marked. 
The test rating is exceeded by only four members of the group, while 
only two have poorer college records. The separate test results 
present no solution to the difficulty since the ratings are high with 
only one exception. The competency rating is "4". It seems 
probable that this man is not particularly interested in his college 
work and is expending most of his time and energy in some kind 
of outside activity. 

No. 25. 
In this instance the record is consistently mediocre. All four of 
the general ratings are either in the middle quintile or slightly below 
the group average. The failure in Psychology 1 is hardly to be 
accounted for by the separate test results, which display no definite 
tendency, and was probably due to lack of application, since the 
student was able to pass the second course. 

No. 26. 

The rather high academic rating in this case seems to contradict 
the low Thurstone and composite test ratings. The low digit span 
and the poor rating on the memory test indicate that this student 
must be a hard worker in order to have received such high grades 
for his college courses. Good trainability as displayed in the second 
trial with the cylinders may be a significant factor in his academic 
work. 

No. 27. 

In this case a very high score on the Thurstone test correlates 
well with a high composite test rating and a high academic rating. 
"Distinguished" grades in both courses in psychology also indicate 
general superiority. A poor performance on the second trial with the 
cylinders which resulted in a competency rating of only "3" is the 
only flaw in an otherwise excellent record. Eight of the thirteen
tests are rated above the middle quintile and indicate nothing more
than an unusually high level of general intelligence. 

No. 28. 
This record offers an interesting comparison with that of student 
No. 27. The composite test ratings and the competency ratings are 
identical in the two cases, while the academic ratings are very nearly 
so. Both students received the highest grade in both courses in 
psychology. In this instance, however, the Thurstone score is
mediocre, and the ratings for the Trabue and cylinder tests are in the 
second quintile. The ratings on those tests which stress language 
ability are generally higher than in the preceding case, while the 
memory spans are conspicuously lower. These facts indicate a 
relatively low intelligence coupled with a rather high intellectual 
development. On the whole, the student is decidedly superior to 
the majority of the group. 

No. 29. 
The record in this case must be considered consistently good 
although it can hardly be compared with either of the two preceding 
cases in general excellence. The academic rating shows a "G" 
average and the psychology grades rate the student even higher. 
While the Thurstone rating is "4", the rating on estimated competency
is higher than that in either of the preceding records. This
rating is not substantiated by the results of the separate tests, only 
four of which are found to be above the middle quintile. These 
seem to point to intelligence rather than to intellectual organization, 
although it would be unsafe to make any specific diagnosis. 

No. 30. 
This record displays a relatively high test rating and a Thur- 
stone rating in the fourth quintile contrasted with an average 
academic rating and unsatisfactory grades in psychology. While
the separate test scores indicate somewhat erratic performances, 
very high ratings on the memory and cylinder tests show that this 
individual has unusual ability in some directions. It seems probable 
that lack of interest or want of application is responsible for the 
deficiency in psychology. 

No. 31. 

The mediocre composite test rating in this case does not corre- 
late with the generally high level of the other ratings, all of which 
are in the fourth quintile. Although the separate test results are 
distributed through the five quintiles, they show no definite tendency
which might be considered explanatory. Possibly the high
degree of trainability displayed in the second cylinder trial is significant,
but it seems likely that this student either did not take the
tests seriously or that some strong motivation factor has entered into 
his college work. 

No. 32. 
This record presents as great a contradiction as is to be found 
in the whole group. While only two students have academic records 
which exceed the rating in this case, only three have lower composite 
test ratings. Moreover, the estimated competency rating is "2" 
and the Thurstone test rating "5". Only three students have better
grades in the two psychology courses. In the separate tests, low
ratings were received on the Taylor number, digit span, Trabue,
differences, definitions, and second cylinder trial. Only the syllable
span and memory tests were rated higher than the middle quintile,
the latter receiving the only "5" of the series. It seems hardly
possible to explain the excellent academic record on the basis of good 
memory alone, and the only conclusion which can be reached is that 
the test results do not reflect the evident competency of this student. 

No. 33. 

All things taken into consideration, this is the poorest record in 
the group. The academic rating is low and one of the courses in 
psychology was not passed. The competency rating and that on 
the Thurstone test are both in the lowest quintile, the score on the 
latter test being the lowest made by any of the fifty students. The 
composite test rating is one of the lowest in the group, and only two 
of the separate test results are placed above the middle quintile. 
A rating of "5" in the memory test suggests that this ability may
have enabled the student to stay in college. Low ratings on the
Taylor, digit span, syllable span, Trabue, differences, opposites and
definitions tests and the first trial with the cylinders indicate a very 
general deficiency. The test results in this case are quite similar 
to those in the record of student No. 32, but seem here to be really 
significant. 

No. 34. 

In this instance, the various ratings of the record correlate 
well to show better than average competency. The academic rating 
is good, the psychology grades very good, and the competency rating 
is in the fourth quintile. The Thurstone score is high, and while the 
composite test rating is only fair, the separate test results show no 
marked deficiencies. Low ratings in the Ausfrage and Taylor tests 
are not particularly significant, while higher ratings in the digit span,
Courtis, and memory tests and second cylinder trial indicate associability,
speed, retentiveness and trainability. On the whole, the
record shows no contradictions.

No. 35.
This record is consistent in so far as the composite test rating, 
the Thurstone rating and the competency rating are concerned. 
The test rating is equaled only by student No. 2, and both of the other 
ratings place this student in the highest quintile. In academic work, 
however, only an average rating is to be found, and the explanation 
must probably be based on lack of interest in studies or absorption 
in other activities. High ratings on the Taylor number, digit span, 
Trabue, Courtis, definitions and memory tests, on both trials with 
the cylinders, and on the Thurstone test indicate that this student 
has the ability to do excellent college work if he so desires. 

No. 36. 
Although a number of the separate test results are missing in
this record, the ratings on the Thurstone test and estimated competency
as well as the composite test rating indicate a rather high
level of mentality. The academic rating, however, is one of the
lowest in the group and shows that conditions and failures were
received in a number of courses, even though the work in psychology
was somewhat above the average. The evidence seems fairly conclusive
that this man could do better college work if he wished to
apply himself. Interest in outside activities probably explains the 
discrepancy between the test ratings and the academic record. 

No. 37. 

With the exception of a low rating on the Thurstone test, this 
record is consistently mediocre. The academic rating, competency 
rating and composite test rating all appear in the middle quintile. 
A good performance on the first trial with the cylinders, followed 
by an excellent second trial, indicates intelligence and trainability,
while a low rating on the memory test may explain the mediocre 
college record. 

No. 38.

The record in this instance is consistently below the average 
for the group and may be considered typical of the second quintile. 
The academic rating is low, the psychology grades merely passing, 
the competency rating and the Thurstone rating are both "2", and
the composite test rating decidedly below the average. Low ratings
were received on the Taylor number, idea span, description, and
Trabue tests, while the digit span, definitions and cylinder tests 
were rated above the middle quintile. No ratings in the highest
quintile appear. An analysis of these results seems to indicate
good associability and intelligence coupled with rather deficient
intellectual organization. This man would probably be more 
successful in business than in an academic or professional vocation. 

No. 39. 

This record disputes with that of student No. 33 the distinction 
of being the poorest in the group. The fact that the student was 
excluded from the University at the end of the session gives peculiar 
interest to this case. An observation of the grades received in college 
courses discloses the significant fact that eight units of work were 
assigned a grade of "D", while an equal number received a "G". 
Eight units of credit were merely "Passed", conditions were given
for three units, and the remaining eight units received the grade 
"F". Passing grades were assigned for both courses in psychology. 
This unusual distribution of grades suggests specific ability along 
certain lines with marked variations in interest. The student would 
probably have received "Distinguished" grades in all of his college
work if he had been allowed free election of courses. Low ratings on 
the Thurstone and Courtis tests show that he cannot think quickly, 
while poor scores in the Trabue and memory tests indicate deficiency 
in imagination and retentiveness. High ratings on the Taylor 
number and cylinder tests show that there is no deficiency in the rate 
of discharge of energy, and that distribution of attention and intelligence
are both above the average. It seems probable that this man,
now being free to follow his own inclinations, will be successful in the 
vocation which he chooses. The case is particularly interesting as an 
example of the influence of special abilities and of motivation in the
behavior of the individual. 

No. 40. 

In this case a very low competency rating is contradicted by 
a composite test rating only slightly below the average and Thur- 
stone and academic ratings in the fourth quintile. The competency 
rating was doubtless influenced by very poor performances in both 
cylinder trials, but this deficiency in intelligence is compensated for 
by high ratings in the syllable and idea spans, Courtis, definitions 
and memory tests. In other words this student has the associability,
alertness, language ability and retentiveness necessary to do good 
college work. It is possible, also, that lack of interest in the tests 
may have affected the significance of the results. 

No. 41. 
This is a consistently mediocre record with the exception of 
the psychology grades, which are slightly above the average, and
the competency rating, which is very high. The academic rating is
slightly below the median and the composite test rating is median
for the group. The Thurstone score is placed in the middle quintile. 
High ratings are shown for the Ausfrage, Courtis, and first cylinder 
trial. The latter, however, is offset by a very poor performance in 
the second trial with the cylinders. Low ratings also appear for the 
Taylor number, digit and syllable spans, Trabue, and memory tests.
These results indicate rather poor general intelligence and suggest 
that the competency rating is too high. 

No. 42. 

Every one of the principal ratings in this record occurs in the
highest quintile, and the student must be ranked definitely with the 
leaders of the group. High ratings on the Ausfrage, description, 
Trabue, and memory tests and on both cylinder trials show good 
observation, imagination, retentiveness and intelligence. A low 
rating on the digit span is neutralized by a high idea span. Other 
low ratings on the Courtis and opposites tests do not seem significant.
On the whole the record is unusually consistent and justifies the high 
competency rating. 

No. 43. 

Although the composite test rating in this case is about average 
for the group, the academic rating is decidedly inferior. The com- 
petency rating is the lowest given to any member of the class, and 
is based on very poor performances with the cylinders. Although 
this student seems to lack intelligence, high ratings were obtained in 
the Ausfrage, Trabue, and definitions tests. Low ratings for the 
Taylor number, digit span, and Courtis tests indicate a consistently 
poor performance in those tests which do not involve language ability. 
The good ratings in the strictly intellectual tests suggest that outside 
activities are responsible for the low academic rating. 

No. 44. 

This record seems to be typical of the fourth quintile. The 
academic record shows a preponderance of "Good" grades, and this
mark was received for both courses in psychology. The competency 
rating is "4" and the Thurstone rating "3". The composite test
rating is one of the best in the group, although fifth quintile ratings 
appear only for the Ausfrage test and the first cylinder trial. Other 
test ratings show a high level of general intelligence with no sig- 
nificant disabilities. 




No. 45.
An academic record in the fourth quintile is accompanied in 
this case by a Thurstone score in the middle quintile, a competency 
rating in the second, and a composite test rating in the lowest quintile 
of the group. This unanimous absence of correlation is also shown 
in the separate test results where ratings in all five quintiles appear. 
A high rating on the Taylor number test suggests good distribution 
of attention, but even this ability must have been lacking in the 
cylinder performances. The test results show no definite tendency, 
but display a low level of general intelligence. The high academic 
rating notwithstanding, this student falls below the middle quintile 
of the group in competency. 

No. 46. 
The lowest academic rating in the group is displayed by this 
senior, who, nevertheless, was able to graduate with his class. While 
the Thurstone score is poor, the competency rating and the com- 
posite test rating are both high. The separate test results are low 
for digit and idea spans, but high for most of the other tests with 
exceptionally good performances on the cylinder test. This student 
was evidently doing no more college work than was necessary to ob- 
tain his degree, and was probably interested in outside activities. 

No. 47. 
The competency rating, composite test rating, and academic 
rating agree in placing this student in the second quintile. The 
rating on the Thurstone test is very high, and the grades in psychology 
the poorest in the group, consisting of an "N" for the first course
and a "Failure" for the second. High ratings on the Thurstone 
and Courtis suggest a rather quick mind when familiar operations 
are involved, while the very low ratings on the cylinder test indicate 
inability to meet a new problem successfully. Since the subject- 
matter of the courses in psychology is quite unlike that of most 
college courses, the inability of the student to adapt himself to the 
new situation is probably the cause of his deficient work in this 
subject. Although the result for the memory test is missing, a high 
rating in that ability may be predicted. 

No. 48. 
In this record the composite test rating, the competency rating
and the Thurstone rating indicate a very high level of general intelligence.
The academic rating, however, is far below the average for
the group. Of the separate test results, only two fall below the middle
quintile. The low ratings on the Trabue and Courtis tests are difficult
to explain in the light of the other test ratings, five of which are
in the highest quintile. Excellent associability, language ability,
retentiveness, and intelligence are displayed in the various test scores, 
and the only explanation of the relatively poor college grades seems 
to lie in lack of interest or absorption in outside activities. 

No. 49. 
Although the Thurstone score, the competency rating, the com- 
posite test rating, and the grades in psychology agree in placing this 
student below the middle quintile, the academic rating is the median 
for the group. As is frequently the case where this situation is 
encountered, the rating on the memory test is high. In addition to 
this test only the Ausfrage and the syllable span were rated higher 
than the middle quintile, while eight of the thirteen tests fell below 
that level. It seems certain that more than the usual amount of
industry is expended by this individual on his college work.

No. 50.
This record is quite similar to that of student No. 40 with the 
exception that the composite test rating is slightly lower and the 
academic rating somewhat higher than in the preceding case. Here, 
however, the Thurstone rating is high and the competency rating 
and psychology grades average. Of the separate test results, only 
the rating on the memory test is in the highest quintile. The ratings 
for the Taylor number, description, Trabue, Courtis and first cylinder 
tests are in the lowest quintile. The second trial with the cylinders 
indicates good trainability, which with the assistance of an unusually 
good memory may account for the high academic rating. On the
other hand, lack of effort in the tests may be responsible for the low 
composite test rating, and is suggested by the high score on the 
Thurstone test. 

Summary.

A scrutiny of the analyses of the fifty individual records shows 
that these may be separated into two general groups. In twenty-six 
cases the correlation between the various ratings is close enough to 
present fairly conclusive evidence of the relative performance level 
of the student. These cases, in turn, naturally fall into five classes 
corresponding roughly with the points of a five-division scale, which 
may be referred to here as very good, good, medium, poor, and very 
poor. Seven records are so consistently high as to warrant a place
in the first group, while five more are distinctly better than the
average and may be considered "good". Eight cases occur in the
"medium" class, and of the six which fall below this level two are
"poor" and four show such a general inferiority as to justify placement
in the lowest group. The twenty-four remaining records,
which display a decided lack of correlation between the various 
major ratings, exhibit two opposing tendencies. In fourteen cases 
the academic rating is higher than would be predicted from the test 
results, while in the ten remaining cases the Thurstone score, com- 
petency rating and composite test rating would seem to indicate 
better scholastic ability than is displayed in the academic rating and 
psychology grades. The following summary shows the classification 
of each individual record. 

Classification of Individual Records.

I. Cases showing general correlation of ratings:

    Very good                        2, 13, 14, 27, 28, 42, 44
    Good                             11, 18, 19, 21, 29
    Medium                           4, 5, 6, 20, 23, 37, 40, 41
    Poor                             22, 47
    Very poor                        3, 33, 38, 39

II. Cases where correlation is lacking:

    High academic, medium mental     1, 9, 12, 31, 34
    High academic, low mental        26, 32, 45
    Medium academic, low mental      8, 10, 16, 17, 49, 50
    High mental, medium academic     15, 30, 35
    High mental, low academic        24, 36, 46, 48
    Medium mental, low academic      7, 25, 43
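
The counts given in the Summary (twenty-six correlated records, fourteen in which the academic rating runs higher, ten in which the mental ratings run higher) can be checked against the classification above. The following is only a minimal sketch in Python: the student numbers are transcribed from the table, while the variable and function names are illustrative assumptions and form no part of the original study.

```python
# Student numbers are transcribed from the classification table.
# All names below (CORRELATED, count, ...) are illustrative assumptions.

# Group I: cases showing general correlation of ratings.
CORRELATED = {
    "very good": [2, 13, 14, 27, 28, 42, 44],
    "good": [11, 18, 19, 21, 29],
    "medium": [4, 5, 6, 20, 23, 37, 40, 41],
    "poor": [22, 47],
    "very poor": [3, 33, 38, 39],
}

# Group II, first tendency: academic rating higher than the tests predict.
ACADEMIC_HIGHER = {
    "high academic, medium mental": [1, 9, 12, 31, 34],
    "high academic, low mental": [26, 32, 45],
    "medium academic, low mental": [8, 10, 16, 17, 49, 50],
}

# Group II, second tendency: test ratings higher than the academic record.
MENTAL_HIGHER = {
    "high mental, medium academic": [15, 30, 35],
    "high mental, low academic": [24, 36, 46, 48],
    "medium mental, low academic": [7, 25, 43],
}


def count(groups):
    """Return the total number of students across all classes of a group."""
    return sum(len(members) for members in groups.values())


def all_students():
    """Return every classified student number, in ascending order."""
    numbers = []
    for groups in (CORRELATED, ACADEMIC_HIGHER, MENTAL_HIGHER):
        for members in groups.values():
            numbers.extend(members)
    return sorted(numbers)


if __name__ == "__main__":
    print(count(CORRELATED), count(ACADEMIC_HIGHER), count(MENTAL_HIGHER))
    # Each of the fifty students should be classified exactly once.
    print(all_students() == list(range(1, 51)))
```

Under this transcription the three totals agree with the figures stated in the text, and each of the fifty students appears in exactly one class.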

Although in some cases the evidence is not so clear cut as the 
summary above may seem to indicate, the classification nevertheless 
is justified by the data at hand. It also seems reasonable to attribute 
the absence of correlation shown in the second group of records to 
variations in motivation and other external factors which have not 
as yet lent themselves to quantitative measurement. Of two men 
who have the same composite test rating and who may be assumed 
to possess equal competency, one may be intensely interested in his 
studies and impelled by a consuming ambition to gain the greatest 
possible benefit from his college course, while the other is content to 
do only the amount of work necessary to fulfil the minimum scholastic
requirements and seeks to excel in athletic or social activities. Again, 
the first student may be devoting all of his time and effort to college 
work, while the second is compelled to expend much of his energy
in supporting himself. Certainly no series of mental tests will
correlate closely with academic standing until some satisfactory
method of evaluating these factors external to competency has been 
devised. At present it is possible to do no more than call attention 
to the lack of correlation and attempt to explain the discrepancies 
in the most logical manner. 

Conclusions. 

(1) The psychologist should engage in the analysis and evaluation
of the "ability" components of the college student's competency
rather than in the correlation of general intelligence tests with academic
grades.

(2) The abilities required for scholastic success, under the 
present methods of college instruction and grading, are not all of the 
abilities comprising individual competency. Hence the failure of 
test results to correlate with college grades. The better the general 
intelligence test, the smaller will be the correlation with academic 
standing. 

(3) College grades will provide more satisfactory material 
for statistical treatment when each institution adopts a standard 
distribution of grades and provides for supervision by some adminis- 
trative officer. 

(4) Tests for college students must be devised which place less 
dependence upon time measurement, which have a higher reliability 
coefficient, and which are of greater difficulty, than most of the tests 
now available. 

(5) Motivation and environmental and economic conditions 
have not as yet yielded to quantitative treatment. Until they do, 
it will not be possible to predict with accuracy the success of a student 
in college or in any other field of endeavor. 

(6) Test ratings such as those presented here should be made 
available to deans, faculty advisers, and committees dealing with 
scholastic deficiency. In many instances this information would 
be of value to the student, also, providing him with educational or 
vocational guidance. 

(7) A "follow up" of the fifty students who have provided the 
material for this study will be published at some future date. 

(8) Only after many investigations are at hand with diagnoses 
carefully followed up over a period of years will psychological diag- 
nosis and orthogenic guidance become as reliable for the normal 
individual as it is now for the subnormal. 




BIBLIOGRAPHY.

1. Wissler, Clark. The Correlation of Mental and Physical Tests.
Psychological Review Monograph Supplement 3, 1901, No. 6, 1-61.

2. Calfee, M. College Freshmen and Four General Intelligence Tests.
Journal of Educational Psychology, 1913, 4, 223-231.

3. Rowland, E., and Lowden, G. Report of Psychological Tests at Reed
College. Journal of Experimental Psychology, 1916, 1, 211-217.

4. Waugh, Karl T. A New Mental Diagnosis of the College Student.
New York Times Magazine Supplement, January 2, 1916.

5. Kitson, H. D. Scientific Study of the College Student. Psychological
Monograph No. 98, 1917. Pp. 81.

6. King, I., and McCrory, J. Freshmen Tests at the State University of
Iowa. Journal of Educational Psychology, 1918, 9, 32-46.

7. Caldwell, H. H. Adult Tests of the Stanford Revision Applied to
College Students. Journal of Educational Psychology, 1919, 10, 477-488.

8. Rogers, A. L. Mental Tests as a Means of Selecting and Classifying
College Students. Journal of Educational Psychology, 1920, 4, 181-192.

9. Humpstone, H. J. Some Aspects of the Memory Span Test. Experimental
Studies in Psychology and Pedagogy, 7. Psychological Clinic Press,
Philadelphia, 1917. Pp. 31.

10. Terman, L. M. The Measurement of Intelligence. Houghton-Mifflin
Company, Cambridge, 1916. Pp. 362.

11. Trabue, M. R. Completion-Test Language Scales. Teachers College,
Columbia University, 1916. Pp. 118.

12. Courtis, S. A. The Courtis Standard Tests. Department of Co-operative
Research, Detroit, 1914. Pp. 125.

13. Whipple, G. M. Manual of Mental and Physical Tests, Part II.
Warwick and York, Baltimore, 1915. Pp. 336.

14. Paschal, F. C. The Witmer Cylinder Test. The Hershey Press,
Hershey, Pa., 1918. Pp. 64.

15. Finkelstein, I. E. The Marking System in Theory and Practice.
Warwick and York, Baltimore, 1913. Pp. 87.


